Andreas Weigend
Stanford University
Stat 252 and MS&E 238

Data Mining and Electronic Business


Class 5 - HW3, HW4, Prediction Markets


  • Brief Synopsis

    • Discuss HW3 - emphasis on data and metrics
    • Discuss HW4 - emphasis on data and metrics
    • Setting up Prediction Markets
      • last week we learned what data we have: this is the best data available today as far as "ground truth" goes
      • we discussed how to predict the success of start-ups
      • we want to elicit richer data than a count of how many blogs are blogging about a startup
      • "wisdom of the crowds" nicely summarizes the knowledge you can get from crowds
      • 5 levels of abstraction (prediction markets cut through these 5 levels in a vertical way):
        • (1) the data we collect (e.g. transaction data such as Safeway, click data on the web) (lowest level)
        • (2) experiments
          • subjects are aware of the experiment
          • without people knowing (e.g. Amazon, Google)
        • (3) participation, or "active contribution" - the richer the data, the easier to come up with conclusions for e-business
        • (4) interactions between people (social networks)
        • (5) architectures of collaboration; people have effects of working together that transcends what the individual does
      • In a prediction market, people buy contracts whose payoff depends on a future event
      • The better we are at creating environments that incentivize people to tell us what they know, and the more accurately we capture what people actually believe, the easier the data mining becomes down the road.
      • Definition of prediction market (source: Wikipedia; http://en.wikipedia.org/wiki/Prediction_market) : Prediction markets are speculative markets created for the purpose of making predictions. Assets are created whose final cash value is tied to a particular event (e.g., will the next US president be a Republican) or parameter (e.g., total sales next quarter). The current market prices can then be interpreted as predictions of the probability of the event or the expected value of the parameter.
  • Detailed class notes:

Homework 3

    • We built a recommender system for del.icio.us
    • We used statistics to drop some nodes: we dropped URLs that get tagged only once. This is a big assumption; we assume that signal>noise if several people’s trajectories go through a website. One tagging=noise; 2=signal.
    • A small % of websites are actually tagged by more than 1 user; the majority are tagged once (see lecture by Bill Tancer). So we actually have only very few other users to look at, and only their few URLs from which to recommend.
    • By design, an algorithm that removes noise by focusing on URLs that occur more than once (favoring more popular URLs over less popular URLs) is biasing towards the common.
    • When is this property desired? During discovery you actually don’t want this – you want to discover the new, the different, not just conform with what everyone else is tagging/searching/thinking. If you are looking for individuality, you would NOT use this method of removing noise.
    • What are its potentially negative side effects? "the rich get richer" - those URLs that are already tagged will be tagged more, and there is less uniqueness in the world.
    • To create a bold recommendation system, biased towards uniqueness, timeliness, and buzz, how would you find the “interesting” users? How do you derive recommendations from their most “interesting” bookmarks? How do you know whether a blogger posted something important, or whether it was just random and unimportant?
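The noise-removal rule above (drop URLs tagged only once) can be sketched in a few lines of Python. The flat list of (user, url, tag) triples is a hypothetical simplification of the del.icio.us data from HW3, not its actual format:

```python
def filter_singletons(bookmarks):
    """Keep only bookmarks whose URL was tagged by more than one user
    (one tagging = noise; two or more = signal)."""
    users_per_url = {}
    for user, url, tag in bookmarks:
        users_per_url.setdefault(url, set()).add(user)
    return [b for b in bookmarks if len(users_per_url[b[1]]) > 1]

# Toy data (made up for illustration):
bookmarks = [
    ("alice", "weigend.com", "data"),
    ("bob",   "weigend.com", "mining"),
    ("carol", "obscureblog.example", "misc"),  # tagged by one user -> dropped
]
kept = filter_singletons(bookmarks)  # only the two weigend.com bookmarks remain
```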
When there are many bookmarked URLs, we rank the results based on weighting. We don't just use the fact that a user has tagged the URL while ignoring the specific tags and the comments. Rather, we go further. Example weights to use:
      • Similarity of links (entire url, or just domain name?) - heavier weights for users who have more similar tagging history to yours
      • Similarity of tags
      • Similarity of comments
      • Time with respect to that page’s tagging history (can distinguish between early identifiers and those that tag after everyone else has already tagged the page)
      • Similarity of tags chosen by two people who tagged the same page
      • Have they recently been using that tag – relative history of that tag for the person (new tag, or a tag they always use)
      • Lower weights for users who have not been using del.icio.us for some time
      • How should we treat time?
        • Relative to the user having tagged weigend.com: now that both users are on weigend.com, should we weight negatively the sites they visited beforehand, and positively the pages that came afterwards, to see whether they changed behavior as a result of this site?
        • Relative to current time: Should a URL tagged yesterday be weighted more than a URL from a year ago?
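As a sketch of how two of these weights might be combined, here is one possible scoring rule. The Jaccard similarity and the 30-day half-life are arbitrary illustrative choices, not prescriptions from the lecture:

```python
def tag_similarity(tags_a, tags_b):
    """Jaccard similarity between two users' tag sets; cosine similarity
    over tag counts would be another reasonable choice."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def recency_weight(days_ago, half_life=30.0):
    """Exponential decay relative to current time: a URL tagged yesterday
    counts more than one tagged a year ago."""
    return 0.5 ** (days_ago / half_life)

# Weight another user's bookmark by tag overlap with mine, discounted by age:
weight = tag_similarity(["python", "data"], ["data", "ml"]) * recency_weight(7)
```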
Metrics: how will you evaluate the recommendation system?
      • First, evaluation must be online and dynamic; that's why the Netflix challenge was a grand PR scheme, but clearly no one would win
      • Return frequency
      • Number of recommendations clicked on
      • Subsequent actions, e.g. bookmarking the discovered link (attention gesture)
      • Purchasing (however, a user might not click on an insistent recommendation because he actually already owns the item! That's why Amazon put up a 'yes, I already have it' option, so they can learn from this)
      • Explicit ratings or free-text response / feedback
      • Compare systems to benchmarks – "benchmark" here means that users don't know whether they are using the benchmark or the tested system. In an A/B test, we show the two systems in parallel to groups A and B, then measure which system does better based on predefined metrics. It is important to run the groups in parallel so that we measure real differences rather than confounding factors.
        • Benchmarks for recommender systems:
          • Recent clicks/tag of someone else (conditioned on the domain)
          • Popular things (can condition on topic, to subset domain of things you are interested in)
          • Other sites that have the same words (difficult though; what does it mean for two sites to be the same?)
          • Other recommendation system algorithms
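Evaluating such an A/B test on a predefined metric (here click-through rate) can be sketched with a standard two-proportion z-test; all of the numbers below are made up for illustration:

```python
import math

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z-statistic comparing the click-through rates of system A and a
    benchmark B, shown in parallel to randomly split user groups."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p = (clicks_a + clicks_b) / (n_a + n_b)           # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# New recommender (12% CTR) vs. "popular items" benchmark (9% CTR):
z = two_proportion_z(120, 1000, 90, 1000)  # |z| > 1.96 -> significant at 5%
```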
    • Data

      • We did homework 3 to learn Python and actually access the data
      • We got familiar with the most famous tagging system, and understood the limitations
      • We got a feeling for the many decisions you must make when you build a model (highlighted in the homework and in class today) – there’s a lot to worry about when you look at data
      • Then must think how to evaluate the model: what does it mean to do well/poorly; this is critically important.

    Homework 4 - Developing a consumer confidence index based on search term data


Definitions


      • Search Term Portfolio: This tool allows you to gauge the volume of executed searches on specific queries relative to other queries. Terms can be uploaded manually or via CSV file. (Note: this tool takes 1-2 days to build once terms are uploaded.)
      • Search Term Volume: Measures the volume of the subject search queries relative to all search queries in the U.S. (Note: you can restrict this measure to a specific industry or site.)
      • Term Breadth: Unlike search term volume, search term breadth measures the incidence of queries that contain a word or phrase over the incidence of all queries.
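The difference between volume and breadth can be made concrete with a toy query log (the function names and data are ours, not Hitwise's):

```python
def term_volume(query_log, term):
    """Share of all queries that are exactly `term` (search term volume)."""
    return sum(q == term for q in query_log) / len(query_log)

def term_breadth(query_log, word):
    """Share of all queries that merely *contain* `word` (term breadth)."""
    return sum(word in q.split() for q in query_log) / len(query_log)

log = ["cheap flights", "flights to nyc", "mortgage refinance", "cheap flights"]
vol = term_volume(log, "cheap flights")  # 0.5  (2 of 4 queries match exactly)
brd = term_breadth(log, "flights")       # 0.75 (3 of 4 queries contain the word)
```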
    • What's its purpose? In informal conversation with Prof. Weigend, high-level World Bank employees stated that roughly half of the economy is driven by the consumer confidence index. Therefore, an accurate measure of this value would be very useful. The belief is that a web-based solution would be more accurate than mailing a survey to 5000 people, because we would be examining the decision process of what people actually do as opposed to what they say they do.
    • What is the CCI representative of?

      • Stockmarket
      • Unemployment
      • Consumption
        • Amazon
          • Conversion rates
            • What percentage of visitors buy a product after looking at it?
            • What percentage of visitors buy a product after placing it in the shopping cart?

        • eBay
          • What percentage of money is spent on new items vs old items?
          • Avg purchase price in categories
          • Unsuccessful offers (reserve price not met)
        • Cars
          • What percentage of cars purchased are new vs used?
        • Housing
        • Food
          • Opentable reservations
            • How many reservations are being made per day?
            • The average number of stars (price range of the restaurant) assigned to each reservation.
        • Music / Entertainment
          • Movie attendance (box office)
          • Concerts
          • Sporting Events
        • Travel
          • Amount of travel originating in the US.
            • Leisure travel
            • Business travel
          • Percentage of people staying at hotels vs motels
          • On sites that sell tickets and hotel deals, are people sorting by price or by schedule/location? If sorting by price, how far from the cheapest option do they look?
          • Percentage of people taking a shuttle vs limo from the airport.
        • Mobile phone
          • Are people cutting down on usage plans?
          • New phone sales.
        • Couponing / Money Saving
          • Usage of coupons
          • Usage of discount sites
        • Education
          • How many people are applying to graduate school - delayed effect due to long admissions process
        • Secondary effects
          • Traffic to counseling sites
          • Suicide rates
    • How can we evaluate our metric?

      • Compare to movement of the stock market.
        • Is searching for stock symbols an indicator of consumer confidence? We would need to check why people are looking them up. Are they bullish about the stock, or are they worried? However, this is very difficult to determine.
          • Maybe Google or Yahoo could show a pop-up when you search for ticker symbols that asks you for your mood.
          • "Sequence of sites" - look at the sites that users go to after looking up a ticker symbol, and try to predict the mood. Did the user search for "how to sell my house" after looking at the stock symbol? Hitwise has this data.
          • Opinmind.com - a website that summarizes how people feel about something. It searches blogs and figures out whether they are making positive or negative comments.
        • Is the S&P 500 or Dow Jones Index an indicator of consumer confidence?
          • Leading and lagging indicators. If they are related, who leads whom?
          • There are 500 stocks in the S&P 500, so a full correlation matrix would have 500 × 500 = 250,000 coefficients. Determining that many coefficients needs a lot of data points; we do not have enough data.
          • Being an indicator of the stock market is not good enough. We need to focus on what people at the IMF, governments, and the World Bank do with the CCI. Go downstream, understand what they are really using the CCI for, and see how our index can improve what they do. Movement of the stock market is only one part of the problem.
        • Stock market predictions are on different time scale than the consumer confidence index. The CCI moves at the granularity of months while the stock market moves in seconds.
        • People worry more about having a job than market movement.
        • This shows how hard it is to measure the accuracy of our metric.
    • Data

      • For simplicity we are only looking at intention data (search terms). We are not considering click data or attention data. Think about how that information could be used to enhance the model.
    • Comment

      • Statistical Overfitting
        • If the model is more complex (more parameters) than what you can estimate from the available data, the training data will fit well but the model will generalize poorly. For example, a polynomial of order 100 passes exactly through 100 data points, but moves up and down a lot between them. The data should be fit with a lower-order polynomial.
        • The training data is different from the data on which the model will be evaluated, and the model will not be retrained; therefore, an overfitted model will perform poorly. Overfitting reduces in-sample error but increases out-of-sample error. There are methods to evaluate your model using only in-sample data:
          • holding back some sample data (a holdout set).
          • cross-validation.
          • jackknifing.
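The first two of these methods can be sketched directly; this is a minimal illustration of the splits themselves (a real evaluation would also refit the model on each training portion):

```python
import random

def holdout_split(data, frac_train=0.7, seed=0):
    """Hold back some sample data: fit on the train part, estimate
    out-of-sample error on the held-back part."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * frac_train)
    return shuffled[:cut], shuffled[cut:]

def k_folds(data, k=5):
    """Cross-validation splits: each fold takes a turn as the held-out
    test set while the model is fit on the remaining k-1 folds."""
    return [data[i::k] for i in range(k)]

train, test = holdout_split(list(range(100)))   # 70 / 30 split
folds = k_folds(list(range(100)), k=5)          # five folds of 20 points each
```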
  1. Prediction Markets

    • Introduction

      • Why is there so much trading on a daily basis?
        • People's perception or belief about the value of a stock is different
        • Someone might believe the price is high and sells, someone else might believe the price is too low and buys
      • In stock markets, the prices are discovered as opposed to being dictated or set by an entity
      • The idea of price discovery is incorporated in Information Markets (aka "Prediction Markets", "Decision Markets", "Idea Futures")
      • How do we get people to tell us what they know?
          • Opinion polls, CCI
            • Problem: collapses probabilities to 1 or 0
          • Information market, where people can buy/sell stocks representing probabilities
            • People can then assess probabilities, by exerting their judgement on how much they are "willing to pay" for a certain outcome
  2. Definition of Prediction Markets

    • A prediction market can be compared to a stock market for ideas or information. The market rewards good information whether it comes from elites or the masses.
    • Prediction markets have built a track record of besting pundits and pollsters when it comes to predicting everything from political elections to quarterly sales figures.
    • Prediction markets are speculative markets created for the purpose of making predictions (From Wikipedia)
  3. Prediction Market Keywords

    • Information elicitation
      • People are more precise or granular in the opinion they express by quantifying or giving actual value to their opinion, and this is more powerful than just answering who wins in a poll.
    • Information aggregation
      • Rather than collapsing probabilities to 0 or 1, we work with the probabilities themselves
    • Machine learning
      • There is a parallel with machine learning: combining weak learners (boosting), i.e., creating many simple models and then combining them to represent a behavior.
    • Wisdom of the crowds (James Surowiecki)
      • Example: the "cow" contest, where the best estimate for the weight of a cow in a raffle was obtained by averaging the individual guesses of a crowd that was not expert in the field
      • Requirements:
        • Crowd needs to be diverse, so that people are bringing different pieces of information to the table.
          • In the stock market, people have different beliefs, that's why they trade and then the price is discovered
        • Crowd needs to be decentralized, so that no one at the top is dictating the crowd's answer.
          • There are no cues, rules or suggestions that the crowd follows as in focus groups where you might nod or gesture to lead the audience into what you believe or suggest
        • Crowd needs a way of summarizing people's opinions into one collective verdict.
          • In our example of the cow breeder, it would be the average of the weights, in stock markets, the price of a stock.
        • People need to make their decisions individually
          • We get better results if people can express their individual beliefs rather than what they believe is expected of them or what others believe. This is in contrast to VCs, who tend to behave in a follow-the-leader pattern.
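A quick numerical illustration of the averaging requirement, using simulated guesses rather than data from the lecture: if 800 diverse, independent guessers each err by about 150 lbs, the crowd average lands far closer to the (assumed) truth than a typical individual does.

```python
import random

def crowd_estimate(guesses):
    """The collective verdict: a simple average of independent guesses."""
    return sum(guesses) / len(guesses)

rng = random.Random(0)
true_weight = 1200                                # assumed true weight, in lbs
guesses = [rng.gauss(true_weight, 150) for _ in range(800)]

crowd_error = abs(crowd_estimate(guesses) - true_weight)
# crowd_error is on the order of 150 / sqrt(800), i.e. a few lbs --
# roughly 30x smaller than a typical individual error of ~150 lbs.
```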
  4. Setting up the market - "Considerations for creating a good financial market."

    • Define a specific example

      • The Start-up Company case we considered in class is an example of a prediction market.
        • It follows the definition of prediction markets given above since it creates a market for contracts on anything related to the success or failure of startups.
          • To be clear, a market is just a formal, structured way to bring together the people who want to buy and the people who want to sell a certain item
        • The information that we hope to aggregate from this market is how successful individual startups will be.
      • This contrasts to just using Hitwise data
        • You can get large amounts of search and traffic data from Hitwise, but (as we saw from HW 4) building models can be complex and you have to make many inferences on what you observe. What does it really mean when a site gets 10,000 hits in a day instead of 8,000?
        • Prediction markets allow for simpler models, and it makes sense that a willingness to pay only $10 for a contract paying $100 signals that the public believes in a pretty low chance of success.
    • Data/ Ground Truths

      • What are contractible aspects of startups that are related to their success? (note: you should think of all the items below as being followed by “on [or by] XX date.”)
      • We should try to construct robust quantities that can't be influenced by things not related to the success of the startup.
        • Example: ratio of repeat usage / blog mentions
        • Getting Funding
          • VC – Venture Capital
            • Professor Weigend mentioned that the number of startups that receive VC funding compared to those that don't is very small, and therefore this may not be a very representative metric.
          • other types of funding?
        • Exiting (i.e. stops being a start-up)
          • IPO – Initial Public Offering
            • The company “goes public” and starts being traded on a stock exchange (NYSE, Nasdaq, ...)
            • This is the goal of all startups, and therefore is the definition of a successful startup
            • Again, not many startups have IPOs and we have the same problem as with VC funding
          • Acquired by another company
            • Can be further defined by specifying a purchase price
          • Death – shuts down, bankruptcy
        • Company statistics
          • Employee Growth
            • Should be compared to similar companies in the same field
          • Number of job ads
            • Can get information from the database of a job-ad website (SimplyHired, Monster?)
          • Number of Job Openings
          • The Executive Team
            • Example: “Current CEO not in prison by ...”
            • A lot of this is debatable (is hiring PhDs good or bad?)
          • Number of users / number of customers
          • The successful launch of a product
          • Site traffic
            • Should also be compared to similar companies in the same field
          • Blog mentions
            • Worry about effect of just having a good PR department
            • Can distinguish between good and bad mentions
            • Also look at comments posted in response to blogs
          • Search Terms
            • Collected through a search engine
          • Web Rank
            • Collected through a toolbar users install
            • Examples: Alexa, Comscore, Pagerank
          • Number of Tags or Bookmarks
            • Example: Del.icio.us
          • Number of pics on flickr with tag company name
        • IP
          • Patent granted
          • FDA approval
          • Number of patents applied for
            • Number of claims
          • For all of these, consider taking deltas
    • Others?

      • Retention
        • Example: "From the employees currently employed, 80+% will still be there in 6 months."
        • Problem: some companies fire 10% of the company every year
      • Business Model
      • Pre-trading Market
        • Number of offers for shares
        • Individuals selling the option to buy their shares when the company goes public
      • Burn rate - the rate at which a company uses up its funding before starting to get positive cash flows
      • Location
        • Examples: large city (San Francisco), rent per sq ft
    • Types of Contracts (Wolfers and Zitzewitz, 2004)

      • "Winner-take-all"
        • the contract costs some amount $p and pays off $1 if and only if a specific event occurs, like a particular candidate winning an election
        • the price on a winner-take-all market represents the market's estimate of the probability that the event occurs
      • Index
        • The contract pays $1 for every instance of something, like every percentage point of popular vote won in an election
        • The price on an index market represents the mean value of the outcome
      • Spread
        • Contract pays a set amount if the outcome (y) is higher than some predetermined value (y*). For instance, pay $2 if the candidate gets more than 40% of the vote. The price is fixed; traders bid on the value of the cutoff y* ("more than X% of the vote").
        • The spread y* represents the market's estimate of the median value of y
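Reading probabilities and means off prices for the first two contract types is mechanical. A sketch, where the $1 and $0.01 payoffs follow the definitions above:

```python
def winner_take_all_prob(price, payoff=1.0):
    """Winner-take-all: the price of a contract paying $1 if the event
    occurs is read directly as the market's probability of the event."""
    return price / payoff

def index_mean(price, payoff_per_unit=0.01):
    """Index: if the contract pays $0.01 per percentage point, the price
    divided by the unit payoff is the market's mean estimate."""
    return price / payoff_per_unit

p = winner_take_all_prob(0.10)  # a $0.10 price -> 10% implied probability
m = index_mean(0.42)            # a $0.42 price -> mean estimate of 42 points
```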
    • Issues

      • What are the incentives for startups to share info to get a high price?
        • Give good press coverage if they share
          • Can use blogs, digg, etc.
        • But if they're found out giving false info, the startup's rating will drop
    • How to get people to play

      • Learn from Hollywood Stock Exchange
        • set up incentives
        • rank the best players
      • prizes for top ranking individuals
    • Things to Avoid

      • Question bias
      • Lack of a verifiable outcome on the contract to cash out market
        • Example: Does being "in talks" with Amazon count as a customer?
      • Poor or confusing market description
        • Creates loopholes
      • Not tamper resistant
        • Example: the scam of touting a stock via 10 million spam emails
      • Outcomes are not well defined
      • Unrealistic timeframe of contract
      • Insufficient information available for traders to make a reasonable judgement
  5. References:

    • http://www.weigend.com/Teaching/Stanford/Readings/PredictionMarkets/AEI-Brookings2007.pdf This paper provides recommendations for changes to the law to spur the development of prediction markets, in order to assist both the private and the public sector. Its recommendations focus on a safe harbor for particular groups, to support the development and further understanding of prediction markets.
    • Manifesto on Prediction Markets, published by the luminaries of the field on May 7, 2007 (thanks to Gregor Hochmuth for pointing this out)
    • http://www.weigend.com/Teaching/Stanford/Readings/PredictionMarkets/ChenChuMullenPennock2005.pdf This paper examines the forecast accuracy of a prediction market compared to experts, on the topic of the 2003 NFL games. It shows that, at the same time point ahead of a game, prediction markets give predictions as accurate as the experts'.
    • http://www.weigend.com/Teaching/Stanford/Readings/PredictionMarkets/HahnTetlock_Brookings2005.pdf
      • This 200-page book (available at Amazon) is a set of articles by different authors that discuss the different aspects of prediction markets, from basic principles to more advanced concepts.
      • The first few chapters introduce information or prediction markets, much as any introduction to information markets would. The book talks about how prediction markets can influence public policy by letting the highest bidder actually do something about what he bid for. If the highest bidder succeeds in achieving the goal, he or she earns the price of the contract.
      • One interesting chapter deals with methods for producing incentives to make more people participate in the market as well as incentives for people to release information underlying predictions. Typically, you would imagine that people who have inside information would be reluctant to share this information as it provides them with an edge during the trading process. This chapter discusses several methods to encourage such sharing.
    • http://www.weigend.com/Teaching/Stanford/Readings/PredictionMarkets/Hanson2003.pdf Combinatorial Information Market Design, by Hanson, Robin (2003)
      • Hanson describes information markets and argues why Market Scoring Rules overcome the most common problems in information markets - thin markets and opinion pooling in the thick market case. After introducing market scoring rules, the author looks at several market design issues: how to represent variables to support both conditional and unconditional estimates, how to avoid becoming a money pump from errors in calculating probabilities, and how to ensure that users can cover their bets, without preventing them from using previous bets as collateral for future bets.
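A minimal sketch of the best-known market scoring rule, Hanson's logarithmic market scoring rule (LMSR). Here b is the liquidity parameter, and the numbers in the usage lines are illustrative:

```python
import math

def lmsr_cost(q, b=100.0):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b)), where q_i is
    the number of outstanding shares on outcome i."""
    return b * math.log(sum(math.exp(qi / b) for qi in q))

def lmsr_price(q, i, b=100.0):
    """Instantaneous price of outcome i. Prices are positive and sum to 1,
    so they can be read as probabilities."""
    z = sum(math.exp(qj / b) for qj in q)
    return math.exp(q[i] / b) / z

def trade_cost(q, i, shares, b=100.0):
    """A trader buying `shares` of outcome i pays C(q_after) - C(q_before);
    the market maker always quotes a price, so the market is never thin."""
    q_after = list(q)
    q_after[i] += shares
    return lmsr_cost(q_after, b) - lmsr_cost(q, b)

q = [0.0, 0.0]                  # fresh two-outcome market, no trades yet
p0 = lmsr_price(q, 0)           # 0.5: uninformative prior
cost = trade_cost(q, 0, 10.0)   # a bit over $5 for 10 shares at ~50%
```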
    • http://www.weigend.com/Teaching/Stanford/Readings/PredictionMarkets/IEEE2005TechBuzzGame.pdf This article discusses how Yahoo got into the prediction markets game, in a joint venture between O'Reilly Media and Yahoo! Research. Their key research objectives were: (1) evaluate the power of prediction markets to forecast high-tech trends; (2) test their dynamic pari-mutuel market system for allocating and pricing shares. An interesting conclusion from this paper is that the pricing mechanism needs to be fool-proof, as a couple of 17-year-old traders figured out the problems with pricing and took advantage of them, thereby guaranteeing a positive net profit at all times.
    • Kirtland, A. (2006). http://www.weigend.com/Teaching/Stanford/Readings/PredictionMarkets/Kirtland2006.doc Designing Prediction Markets for End Users. This is an annotation of http://www.boxesandarrows.com/view/communicating_c - a good current intro with relevant examples.
      • In his online article, Alex Kirtland describes the rules necessary to create a prediction market. The main rules the author gives are:
        • Make people want to play
        • Provide bidding/trading examples carefully
        • Don't make too many comparisons to the stock market, but do use it as a learning aid
        • Make it simple
        • Provide information that keeps the user informed
        • Use contextual help to guide users through the process.
    • http://www.weigend.com/Teaching/Stanford/Readings/PredictionMarkets/Pennock2004.pdf David M. Pennock introduces "DPM - Dynamic pari-mutuel market", a hybrid between a pari-mutuel market (for example, bets at horse races, and the "Tech Buzz Game": two or more exhaustive and mutually exclusive outcomes will occur at some time in the future, and those who were right share the money lost by those who were wrong) and a Continuous Double Auction, or CDA (for example the stock market, where orders to buy are continuously matched with orders to sell, and where participants can secure gains and cut losses any time they find a counterparty for the transaction). DPM combines the infinite liquidity and risk-free nature of a pari-mutuel market with the dynamic nature of a CDA.
    • http://www.weigend.com/Teaching/Stanford/Readings/PredictionMarkets/RothColes_Harvard2007MarketDesignIdeas.doc Al Roth and Peter Coles present a list of markets, some of them Internet-based, with the purpose of allowing analysis from a market-design point of view. Amongst the markets presented, we can cite:
      • Zopa.com, a UK-based internet company that matches lenders with borrowers, bypassing traditional banks.
      • Along the same lines we find Prosper.com
      • TicketReserve, a market for ticket options on possible future events and spectacles
      • Google's internal prediction market
      • No online reference available, but the article "Here's an idea: Let everyone have ideas" shows a clever way to put collective intelligence into action in a corporate environment.
    • Galebach, B., Pennock, D., Servan-Schreiber, E., & Wolfers, J. (2004). Prediction Markets: Does Money Matter? //Electronic Markets//. 14:3. The authors try to determine whether there is an informational advantage to using one type of prediction market versus another. They make this determination by conducting a comparative analysis of two forms of prediction markets, real money and play money. The events being predicted are NFL football outcomes; the results from the two types of prediction markets are then compared to the results of individual human predictions. The predictive power of each type of market is compared using 6 assessment metrics:
      • Mean Absolute Error
      • Root Mean Squared Error
      • Average Quadratic Score
      • Average Logarithmic Score
      • Linear Regression
      • Randomization test. None of the metrics was able to show that either market is significantly superior to the other, and the authors' conclusion is that the predictive ability of the two markets and the combined probability of the human predictors are indistinguishable from one another.
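Several of these metrics are straightforward to compute for probability forecasts of binary outcomes. A sketch with made-up numbers; the paper's exact scalings for the quadratic and logarithmic scores may differ:

```python
import math

def mean_absolute_error(probs, outcomes):
    """Average |forecast probability - realized outcome (0 or 1)|."""
    return sum(abs(p - o) for p, o in zip(probs, outcomes)) / len(probs)

def root_mean_squared_error(probs, outcomes):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs))

def avg_log_score(probs, outcomes):
    """log(p) if the event happened, log(1 - p) if not; higher is better.
    The quadratic score is the analogous average of -(p - o)^2."""
    return sum(math.log(p if o else 1.0 - p) for p, o in zip(probs, outcomes)) / len(probs)

probs    = [0.8, 0.6, 0.3]   # market-implied win probabilities (made up)
outcomes = [1,   1,   0]     # realized game results
mae = mean_absolute_error(probs, outcomes)   # ~0.3
```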
    • Wolfers, J., & Zitzewitz, E. (2004). Prediction Markets. //Journal of Economic Perspectives//. 18:2, pp 107-126. Wolfers and Zitzewitz provide the foundation for understanding prediction markets. Their treatise covers the types of contracts that can be traded on prediction markets and the knowledge gained from markets that have been trading predictive contracts, talks about the design of the different types of markets, and suggests areas that are, in their minds, ripe for additional research and/or market making.
      • Types of Contracts:
        • Winner-take-all contract - Contract pays the holder $x if a discrete event takes place, for example, World Bank President Wolfowitz resigns by June 1st. The "price on a winner-take-all market represents the market's expectation of the probability that an event will occur".
        • Index contract - The contract value is a derivative of a number, for example the percentage of the World Bank's governing board that will vote to oust Bank President Wolfowitz. The price paid for an index contract is the "mean value that the market assigns to the outcome."
        • Spread contract - Traders of spread contracts bid on contracts that state a margin by which an event will occur, such as the number of World Bank board members who vote to oust Wolfowitz minus the number who vote not to oust him. The price of spread contracts is fixed, but the spread adjusts based on traders' expectations of the outcome of the underlying event. The spread, then, is the traders' expectation of the median value of the underlying event.
    • Wolfers, J., & Zitzewitz, E. (2005). Prediction Markets in Theory and Practice. Forthcoming, The New Palgrave Dictionary of Economics, 2nd edition. Wolfers and Zitzewitz examine the theory of prediction markets using an expected-utility framework.
      • The authors outline what they believe are the three key benefits of prediction markets:
        • Information aggregation
        • Truthful revelation of beliefs
        • Information discovery
      • Observations about prediction markets:
        • The prices of contracts on prediction markets respond rapidly to new information.
        • Time series of contract prices follow a random walk, consistent with the weak form of market efficiency.
        • There are few opportunities for arbitrage profits as different markets offer prices which are very similar.
        • Attempts at manipulation of prediction markets have largely failed in the past.
        • Forecasts made using information from prediction markets have typically outperformed forecasts using other types of information.
      • Types of predictions that can be made using prediction market data:
        • Election result predictions
        • Government policy predictions
        • War/terror engagement
        • Contingent Markets (Hybrid types):
          • If winner-take-all then index
          • If index then spread.
    • Wolfers, J., & Zitzewitz, E. (2005). Five Open Questions About Prediction Markets. Unpublished.
      • Wolfers and Zitzewitz examine the barriers/questions facing prediction markets, which will determine the efficacy of such markets as widely used forecasting, decision-making, and risk-management tools.
        • Applications of prediction markets:
          • forecasting - HP printer success contracts
          • Risk management - hedging against economic or political events (not very liquid as of now)
        • Five questions
          • How to attract uninformed traders?
            • Solutions: offering sports betting, subsidization, and exploitation of career concerns.
          • How to trade off interest and contractability?
          • How to limit manipulation?
          • Are markets well calibrated on small probabilities?
          • How to separate correlation from causation?
    • http://www.nature.com/nature/journal/v438/n7070/full/438900a.html
      • This article compares Wikipedia (a bottom-up approach, wisdom of the crowd model), to Encyclopedia Britannica (a top-down approach, age-old) - and shows very interesting (and later controversial) results about the accuracy that can be achieved with a "wisdom of the crowd" model.
      • You may need to be on campus to view it, or work through a proxy of the Lane Library, with SuNET ID access.
      • Enjoy!