Suppose that you work for AT&T, which runs customer discussion groups on its website. There are many active discussions happening simultaneously, too many for the company to monitor them all.

    1. How can the company get a general understanding of what is being discussed and how it changes from week to week? Describe your text mining solution, including your choices of text preprocessing and data mining techniques (e.g., association rules, k-means, decision trees).

    2. Each discussion page has slots for two ads. The company would like to select ads that are a good match for the page. Assume that there are many ads to choose from. How should the best two ads for a given page be selected? Describe your text mining solution, including your choices of text preprocessing and data mining techniques (e.g., association rules, k-means, decision trees).
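
One possible way to approach the ad-selection question is to treat it as a retrieval problem: represent the page and every ad as TF-IDF vectors and pick the two ads with the highest cosine similarity to the page. The sketch below is a minimal illustration under stated assumptions, not a definitive answer. The stopword list, the function names (`tokenize`, `tfidf_vectors`, `cosine`, `top_ads`), and the sample page and ad texts are all hypothetical, and the IDF weights are fitted only on the page plus the ads to keep the example small.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "and", "the", "is", "my", "to", "of", "for", "in",
             "on", "with", "your", "our", "i"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords (no stemming here)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per document, using a smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_ads(page_text, ads, k=2):
    """Rank ads by cosine similarity to the page and return the k best ad ids."""
    ad_ids = list(ads)
    vectors = tfidf_vectors([page_text] + [ads[a] for a in ad_ids])
    page_vec, ad_vecs = vectors[0], vectors[1:]
    scores = {a: cosine(page_vec, v) for a, v in zip(ad_ids, ad_vecs)}
    return sorted(ad_ids, key=scores.get, reverse=True)[:k]


# Hypothetical page and ad texts, invented purely for illustration.
page = "My phone battery drains fast and the screen cracked after one drop"
ads = {
    "battery_pack": "Portable battery pack keeps your phone charged all day",
    "screen_repair": "Cracked screen repair for any phone in one hour",
    "family_plan": "Save money with our unlimited family data plan",
    "streaming": "Stream movies and music with a free trial",
}
best_two = top_ads(page, ads)
print(best_two)
```

In practice the IDF weights would more plausibly be learned from a large sample of discussion pages, and stemming or phrase detection could be added to the preprocessing step.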

After observing the effectiveness of your solution for a while, the company realizes that advertising revenue could be improved if ad selection were tuned to each person’s primary interest in using the website. There are five types of primary interest: “phone hardware,” “phone GUI,” “phone apps,” “coverage,” and “price.” For a particular user, how can you use that person’s profile (e.g., age, gender) and behavior (e.g., posts, comments, reading history) to predict which type of user he or she is? Describe your text mining solution, including your choices of text preprocessing and data mining techniques (e.g., association rules, k-means, decision trees).
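
One way to frame the user-type question is as supervised classification: label a sample of users with one of the five interest types, turn each user's profile fields and post text into features, and train a classifier. The sketch below uses a small multinomial Naive Bayes classifier with add-one smoothing, which is a swapped-in technique chosen for brevity (a decision tree, as the question suggests, would fit the same framing). The field names `age_band`, `gender`, and `posts`, the `featurize` helper, the `NaiveBayes` class, and all of the training examples are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def featurize(user):
    """Turn profile fields and post text into one bag of feature tokens."""
    words = re.findall(r"[a-z]+", " ".join(user["posts"]).lower())
    profile = [f"age_band={user['age_band']}", f"gender={user['gender']}"]
    return words + profile


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, users, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for user, label in zip(users, labels):
            self.token_counts[label].update(featurize(user))
        self.vocab = {t for counts in self.token_counts.values() for t in counts}
        return self

    def predict(self, user):
        total = sum(self.class_counts.values())
        # Tokens never seen in training are ignored.
        tokens = [t for t in featurize(user) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled users, invented purely for illustration.
training = [
    ({"age_band": "18-24", "gender": "f",
      "posts": ["the camera lens and battery feel cheap"]}, "phone hardware"),
    ({"age_band": "25-34", "gender": "m",
      "posts": ["no signal at home, dropped calls everywhere"]}, "coverage"),
    ({"age_band": "35-44", "gender": "f",
      "posts": ["my monthly bill is too expensive"]}, "price"),
    ({"age_band": "18-24", "gender": "m",
      "posts": ["the settings menu icons are confusing"]}, "phone GUI"),
    ({"age_band": "25-34", "gender": "f",
      "posts": ["which apps are best for editing photos"]}, "phone apps"),
]
model = NaiveBayes().fit([u for u, _ in training], [y for _, y in training])

new_user = {"age_band": "45-54", "gender": "m",
            "posts": ["calls dropped again, signal is weak"]}
predicted = model.predict(new_user)
print(predicted)
```

With only one example per class this is a toy. A realistic version would need many labeled users, held-out evaluation, and probably separate weighting for profile fields versus text, since a handful of demographic tokens is easily drowned out by post vocabulary.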