Data, Data, and Data

This is my first attempt at a blog on #bigdata, with thoughts and musings on interesting articles, comments and questions. Follow along and feel free to comment.

This week I’d like to invite conversation on this article about Michelle Obama and the White House’s open ebooks initiative. Further down, the article estimates the value of these ebooks at $250 million. Here’s a snippet from the article on how they came up with that figure:

3 (years) X 2,000,000 (students) X 22 (avg. reads/year) X $2 (estimate of the value of a read) = $265 million, which was then rounded down to $250 million to be conservative. In terms of those specific numbers:

  • “3” accounts for the three-year period in which the app’s free content can be accessed
  • “22” is based on the American Library Association’s findings that students in grades 1-12 read approximately 22 titles per year
  • “$2/read” is based on the logic that libraries spend around $1 for an e-book that is in circulation. The NY Public Library reports that titles in its First Book universe are more valuable than library e-books, because they can be read simultaneously by a limitless number of readers, so $2 is an estimate.
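
As a quick sanity check, here is a minimal Python sketch of the arithmetic in the snippet above. The function name and constant names are my own labels, not something from the article. Multiplying the four numbers exactly as quoted gives $264 million, a touch under the $265 million in the snippet; either way it rounds down to $250 million.

```python
# Inputs as quoted in the article snippet; the names are illustrative.
YEARS = 3                  # period the free content can be accessed
STUDENTS = 2_000_000       # number of students
READS_PER_YEAR = 22        # average titles read per student per year
VALUE_PER_READ_USD = 2     # estimated value of a single read


def estimated_value(years, students, reads_per_year, value_per_read):
    """Multiply the four factors to get the total estimated value in dollars."""
    return years * students * reads_per_year * value_per_read


total = estimated_value(YEARS, STUDENTS, READS_PER_YEAR, VALUE_PER_READ_USD)
print(f"Estimated value: ${total:,}")

# Sensitivity check: the result scales linearly with the per-read value,
# so using the $1 library figure instead of $2 halves the estimate.
at_one_dollar = estimated_value(YEARS, STUDENTS, READS_PER_YEAR, 1)
print(f"At $1 per read: ${at_one_dollar:,}")
```

Because the estimate is a straight product, any error in one input (say, the number of students who actually use the app) carries through proportionally, which is why the data source questions below matter.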

Some questions I have if there were more data:

  • what is the data source, how is the data on average reads and students being collected, and how accurate is it?
  • what would be a great way to visualize this data?
    • A histogram of students and avg reads over the year showing an upward trend
    • Most popular titles based on topic/theme
    • Demographics of students accessing the app, what kind of mobile device, location (home, school, etc)
    • Most popular titles among grades

I could go on, but it seems best to outline a few ideas and let you comment and share.

More and more data is being woven into articles, such as this one on Airbnb in New Orleans. Data is everywhere and we can’t avoid it. Does your organization or company have a data strategy, and/or have conversations about how data is being used?




