Frequently Asked Questions (FAQ)

If you don't find the answer to your question below, please submit your question using our contact form located in the navigation bar above. We want to hear from you!

Topix FAQ

1. What is Topix?

Topix is a state-of-the-art, cloud-based web application that aims to "democratize" automated topic discovery and exploration by making it widely available. It supports the innovative data analyst and researcher who needs to find and share answers quickly and effectively, by providing a useful bridge between recent and emerging innovations in academic research and the practical challenges that you face every day.

Read more about Topix.

2. What is Topic Modeling?

Topic modeling is a form of machine learning (often grouped under "text mining") that uses computer programs to automatically discover and extract implicit ("hidden") topics from a set of documents. A topic is a cluster of words that tend to co-occur across those documents. The power of topic modeling is its ability to map each document to any number of topics and to calculate a probability (often shown as a percentage) for each document-to-topic mapping.
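To make the document-to-topic mapping concrete, here is a minimal sketch in Python. It assumes a toy collapsed Gibbs sampler for LDA (a technique discussed later in this FAQ); it is not necessarily what Topix runs internally. The function name lda_gibbs, its parameters, and the sample documents are all hypothetical and chosen only for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler (illustrative only, not Topix's code).

    docs is a list of token lists. Returns one list of topic
    probabilities per document; each list sums to 1.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables used by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start by assigning every word occurrence to a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Weight each topic by how well it fits this document and word.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Turn the counts into smoothed per-document topic probabilities.
    result = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        result.append([(doc_topic[d][k] + alpha) / denom for k in range(num_topics)])
    return result


# Hypothetical mini-collection: two documents about pets, two about finance.
sample_docs = [
    "cat dog pet cat vet dog".split(),
    "dog pet vet cat pet".split(),
    "stock market bank stock trade".split(),
    "bank trade market stock bank".split(),
]

for number, probs in enumerate(lda_gibbs(sample_docs, num_topics=2), start=1):
    shares = ", ".join(f"topic {k}: {p:.0%}" for k, p in enumerate(probs))
    print(f"document {number} -> {shares}")
```

Each printed line shows one document's estimated share of every topic. The shares for a document always add up to 100%, which is the document-to-topic mapping described above. On such a small collection the exact percentages depend on the random seed.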

Learn about Topic Modeling in our tutorial.

3. Can I use Topix on a mobile device?

Yes, we have designed the site to work on any recent-generation mobile device. Our file processor lets you upload documents from, and download them to, Google Drive, Dropbox, and many other sources. However, be aware that the number and size of the documents you can process are limited by the memory and processing power of your device. (Note: In an upcoming release you will have the option of offloading processing to our cloud computers.)

4. I have heard about Latent Dirichlet Allocation (LDA), but what is it really?

Over the past decade or so there have been major advances in the field of topic modeling, stimulated by the seminal papers "Latent Dirichlet Allocation" (2003) by David Blei, Andrew Ng, and Michael Jordan and "Finding Scientific Topics" (2004) by Thomas Griffiths and Mark Steyvers.

From these papers we learned about Latent Dirichlet Allocation (LDA) as a "generative probabilistic model" that uses Bayesian probability to discover hidden (i.e., "latent") topics. "Generative" means that, for the purpose of discovering latent topics, we assume our source documents were generated from mixtures of topics. Each topic is a probability distribution over terms (words), so the topics determine which terms appear in each document.
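The generative assumption can be sketched in a few lines of Python. This is a simplified illustration: the two topics and their word probabilities in EXAMPLE_TOPICS are invented by hand, whereas full LDA also treats each topic's word distribution as a Dirichlet draw. The names sample_dirichlet and generate_document are hypothetical.

```python
import random

# Hand-made topics for illustration: each maps words to probabilities.
EXAMPLE_TOPICS = {
    "pets": {"cat": 0.4, "dog": 0.4, "vet": 0.2},
    "finance": {"stock": 0.5, "bank": 0.3, "trade": 0.2},
}


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [x / total for x in draws]


def generate_document(topics, num_words, alpha=0.5, seed=0):
    """Generate one synthetic document the way LDA assumes documents arise."""
    rng = random.Random(seed)
    names = list(topics)
    # Step 1: draw this document's mixture of topics.
    mixture = sample_dirichlet([alpha] * len(names), rng)
    words = []
    for _ in range(num_words):
        # Step 2: choose a topic for this word position from the mixture.
        topic = rng.choices(names, weights=mixture)[0]
        # Step 3: choose a word from that topic's word distribution.
        vocab = topics[topic]
        word = rng.choices(list(vocab), weights=list(vocab.values()))[0]
        words.append(word)
    return dict(zip(names, mixture)), words


mixture, words = generate_document(EXAMPLE_TOPICS, num_words=12)
print("topic mixture:", {name: round(p, 2) for name, p in mixture.items()})
print("document:", " ".join(words))
```

Topic discovery runs this story in reverse: given only the words in real documents, LDA infers the topic mixtures and word distributions that would most plausibly have produced them.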

Learn more about LDA in our tutorial.