• "The most notable “discovery” in the dataset was that if you simply plotted the number of steps versus the BMI, you would see an image of a gorilla waving at you (Fig. 1b). While we teach our students the benefits of visualization, answering the specific hypothesis-driven questions did not require plotting the data. We found that very often, the students driven by specific hypotheses skipped this simple step towards a broader exploration of the data. In fact, overall, students without a specific hypothesis were almost five times more likely to discover the gorilla when analyzing this dataset (odds ratio = 4.8, P = 0.034, N = 33, Fisher’s exact test; Fig. 1c). At least in this setting, the hypothesis indeed turned out to be a significant liability."
  • "pv – Pipe Viewer – is a terminal-based tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion." Looks very handy.
  • "xsv is a command line program for indexing, slicing, analyzing, splitting and joining CSV files. Commands should be simple, fast and composable." iiinteresting.
  • "Sheetsee.js is a JavaScript library, or box of goodies, if you will, that makes it easy to use a Google Spreadsheet as the database feeding the tables, charts and maps on a website. Once set up, any changes to the spreadsheet will auto-saved by Google and be live on your site when a visitor refreshes the page." This is good.
  • "All it takes to get a website going for a repository on GitHub is a branch named gh-pages containing web files. You also don’t need a master branch, you can have a repo with just one branch named gh-pages. Here is what I think is really cool, if you fork a project with just a gh-pages branch, you’re only a commit away from having a live version yourself. If this repo being forked is using sheetsee.js then everyone is a fork, commit and spreadsheet away from having a live website connected to an easy (a familiar spreadsheet UI and no ‘publish’ flow because Google autosaves) to use database that they manage (control permissions, review revision history)." Very smart.
  • Hosted statistics tool with attractive interface and smart API. Not cheap for its single-tier plan ($99/mo), but looks like it might be worth a poke.