How would you work out how many queries on Bing are related to Music and Entertainment?
Think of easy ways to solve the problem... Scrape lists of artists from sources (Wikipedia etc.), use regex to match them against the queries, etc. Think about tools (SQL Server etc.).
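A rough sketch of the regex idea above, in Python. The artist list and the sample queries are made up for illustration; in practice the names would come from a scraped source and the queries from the search log. The helper names `build_pattern` and `music_share` are hypothetical.

```python
import re

# Hypothetical artist list; a real one would be scraped (e.g. from Wikipedia).
ARTISTS = ["Adele", "Daft Punk", "Miles Davis"]


def build_pattern(names):
    """Compile one case-insensitive regex matching any name as a whole phrase."""
    # Longest names first so multi-word names are tried before shorter ones.
    escaped = sorted((re.escape(n) for n in names), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def music_share(queries, pattern):
    """Return (number of matching queries, fraction of all queries)."""
    if not queries:
        return 0, 0.0
    hits = sum(1 for q in queries if pattern.search(q))
    return hits, hits / len(queries)


PATTERN = build_pattern(ARTISTS)

# Made-up sample standing in for a slice of the query log.
SAMPLE_QUERIES = [
    "adele tour dates",
    "weather in albany",
    "daft punk get lucky lyrics",
    "cheap flights to boston",
]

hits, share = music_share(SAMPLE_QUERIES, PATTERN)
print(hits, share)
```

Running this on a random sample of the log and scaling the fraction up to total query volume would give an estimate. Queries that are about music but name no listed artist are missed, so the result is probably a lower bound.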
Initially, a seed list of words is created by searching wiki pages such as Wikipedia. These terms are then searched for across web pages. For pages that contain many occurrences of these terms, the other important terms on those pages are also treated as related to music and entertainment and are added to the list. This cycle is then repeated. Any search query in which the music-related terms exceed a particular threshold is counted as music and entertainment related. Also, if someone spends a lot of time on a particular page after doing a music search, that page is indexed too, so that music-related words can be extracted from it.
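The seed-and-expand cycle described above could look something like this. It is only a sketch under several assumptions: the pages, the seed, the stopword list, and all thresholds are invented for the example, "important term" is simplified to "appears on at least `new_term_min_pages` music-heavy pages", and the time-on-page signal is left out.

```python
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a complete one).
STOPWORDS = {"the", "and", "on", "with", "for", "new"}


def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic non-stopword tokens."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS]


def expand_terms(seed, pages, page_min_hits=2, new_term_min_pages=2, rounds=3):
    """Grow the term list from pages that already contain enough known terms."""
    terms = set(seed)
    for _ in range(rounds):
        page_counts = Counter()
        for page in pages:
            words = tokenize(page)
            if sum(1 for w in words if w in terms) >= page_min_hits:
                # Count each unknown word once per music-heavy page.
                for w in set(words) - terms:
                    page_counts[w] += 1
        new_terms = {w for w, n in page_counts.items() if n >= new_term_min_pages}
        if not new_terms:
            break  # nothing new was learned, so stop cycling
        terms |= new_terms
    return terms


def is_music_query(query, terms, threshold=0.5):
    """Flag a query when the share of known music terms reaches the threshold."""
    words = tokenize(query)
    if not words:
        return False
    return sum(1 for w in words if w in terms) / len(words) >= threshold


# Made-up pages and seed, standing in for crawled pages and wiki-derived words.
PAGES = [
    "the album features guitar and drums",
    "guitar solo on the new album with vocals",
    "album review guitar vocals and drums",
    "the weather forecast for albany",
]
SEED = {"album", "guitar"}

terms = expand_terms(SEED, PAGES)
print(sorted(terms))
print(is_music_query("guitar drums lessons", terms))
print(is_music_query("weather in albany", terms))
```

The thresholds control how fast the list drifts: set them too low and generic words leak in on every cycle, so a real version would likely weight terms (for example by how rare they are on non-music pages) instead of using raw counts.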