Data Skeptic

The Panel Study of Income Dynamics

Survey Design Working Session

Bot Detection and Dyadic Surveys

Reproducible ESP Testing

A Survey of Data Science Methodologies

Opinion Dynamics Models

Casual Affective Triggers

Conversational Surveys

Do Results Generalize for Privacy and Security Surveys

4 out of 5 Data Scientists Agree

Crowdfunded Board Games

Russian Election Interference Effectiveness

Placement Laundering Fraud

Data Clean Rooms

Dark Patterns in Site Design

Internet Advertising Bureau Media Lab

Your Mouse Reveals Your Gender and Age

Measuring Web Search Behavior

StrategyQA and Big Bench

Ad Blockers Effect on News Consumption

Your Consent is Worth 75 Euros a Year

Automated Email Generation for Targeted Attacks

Tribal Marketing

Nano-targeted Facebook Ads

Debiasing GPT-3 Job Ads

ML Ops in Production

Ad Network Tomography

First Party Tracking Cookies

The Harms of Targeted Weight Loss Ads

Podcast Advertising

Fairness in e-Commerce Search

Fraudulent Amazon Reviewers

Ad Targeting in Amazon Smart Speakers

Adwords with Unknown Budgets

ML Ops Best Practices

Affiliate Marketing Rabbithole

Monetization of YouTube Conspiracy Theorists

User Perceptions of Problematic Ads

Political Digital Advertising Analysis

Fraud Detection in Crowdfunding Campaigns

Artificial Intelligence and Auction Design

Privacy Preference Signals

Neural Architecture Search for CTR Prediction

Algorithmic PPC Management

Data Skeptic: Ad Tech

The Reliability of Mobile Phone Data

Haywire Algorithms

School Reopening Analysis

Modern Data Stacks

Emoji as a Predictor

Polarizing Trends in the Gig Economy

Remote Learning in Applied Engineering

Remote Productivity

Does Remote Learning Work?

Covid-19 Impact on Bicycle Usage

Learning Digital Fabrication Remotely

Remote Software Development

Quantum K-Means

K-Means in Practice

Fair Hierarchical Clustering

Matrix Factorization For k-Means

Breathing K-Means

Power K-Means

Explainable K-Means

Customer Clustering

k-means Image Segmentation

Tracking Elephant Clusters

k-means clustering

Snowflake Essentials

Explainable Climate Science

Energy Forecasting Pipelines

Matrix Profiles in Stumpy

The Great Australian Prediction Project

Water Demand Forecasting

Open Telemetry

Fashion Predictions

Time Series Mini Episodes

Forecasting Motor Vehicle Collision

Deep Learning for Road Traffic Forecasting

Bike Share Demand Forecasting

Forecasting in Supply Chain

Black Friday

Aligning Time Series on Incomparable Spaces

Comparing Time Series with HCTSA

Change Point Detection Algorithms

Time Series for Good

Long Term Time Series Forecasting

Fast and Frugal Time Series Forecasting

Causal Inference in Educational Systems

Boosted Embeddings for Time Series

Change Point Detection in Continuous Integration Systems

Applying k-Nearest Neighbors to Time Series

Ultra Long Time Series


ARIMA is not Sufficient

Comp Engine

Detecting Ransomware

GANs in Finance

Predicting Urban Land Use

Opportunities for Skillful Weather Prediction

Predicting Stock Prices


Translation Automation

Time Series at the Beach

Automatic Identification of Outlier Galaxy Images

Do We Need Deep Learning in Time Series

Detecting Drift

Darts Library for Time Series

Forecasting Principles and Practice

Prerequisites for Time Series

Orders of Magnitude

They're Coming for Our Jobs

Pandemic Machine Learning Pitfalls

Flesch Kincaid Readability Tests

Fairness Aware Outlier Detection

Life May be Rare

Social Networks

The QAnon Conspiracy

Benchmarking Vision on Edge vs Cloud

Goodhart's Law in Reinforcement Learning

Video Anomaly Detection

Fault Tolerant Distributed Gradient Descent

Decentralized Information Gathering

Leaderless Consensus

Automatic Summarization


Even Cooperative Chess is Hard

Consecutive Votes in Paxos

Visual Illusions Deceiving Neural Networks

Earthquake Detection with Crowd-sourced Data

Byzantine Fault Tolerant Consensus

AlphaFold

Arrow's Impossibility Theorem

Face Mask Sentiment Analysis

Counting Briberies in Elections

Sybil Attacks on Federated Learning

Differential Privacy at the US Census

Distributed Consensus

ACID Compliance

National Popular Vote Interstate Compact

Defending the p-value

Retraction Watch

Crowdsourced Expertise

The Spread of Misinformation Online

Consensus Voting

Voting Mechanisms

False Consensus

Fraud Detection in Real Time

Listener Survey Review

Human Computer Interaction and Online Privacy

Authorship Attribution of Lennon McCartney Songs

GANs Can Be Interpretable

Sentiment Preserving Fake Reviews

Interpretability Practitioners

Facial Recognition Auditing

Robust Fit to Nature

Black Boxes Are Not Required

Robustness to Unforeseen Adversarial Attacks

Estimating the Size of Language Acquisition

Interpretable AI in Healthcare

Understanding Neural Networks

Self-Explaining AI

Plastic Bag Bans

Self Driving Cars and Pedestrians

Computer Vision is Not Perfect

Uncertainty Representations

AlphaGo, COVID-19 Contact Tracing and New Data Set

Visualizing Uncertainty

Interpretability Tooling

Shapley Values

Anchors as Explanations

Mathematical Models of Ecological Systems

Adversarial Explanations


Visualization and Interpretability

Interpretable One Shot Learning

Fooling Computer Vision

Algorithmic Fairness


NLP in 2019

The Limits of NLP

Jumpstart Your ML Project

Serverless NLP Model Training

Team Data Science Process

Ancient Text Restoration

ML Ops

Annotator Bias

NLP for Developers

Indigenous American Language Research

Talking to GPT-2

Reproducing Deep Learning Models

What BERT is Not


BERT is Shallow

BERT is Magic

Applied Data Science in Industry

Building the howto100m Video Corpus



Catastrophic Forgetting

Transfer Learning

Facebook Bargaining Bots Invented a Language

Under Resourced Languages

Named Entity Recognition

The Death of a Language

Neural Turing Machines

Data Infrastructure in the Cloud

NCAA Predictions on Spark

The Transformer

Mapping Dialects with Twitter Data

Sentiment Analysis

Attention Primer

Cross-lingual Short-text Matching



Simultaneous Translation at Baidu

Human vs Machine Transcription


Text Mining in R

Recurrent Relational Networks

Text World and Word Embedding Lower Bounds


Authorship Attribution

Very Large Corpora and Zipf's Law

Semantic Search at GitHub

Let's Talk About Natural Language Processing

Data Science Hiring Processes

Holiday Reading - Epicac

Drug Discovery with Machine Learning

Sign Language Recognition

Data Ethics

Escaping the Rabbit Hole

[MINI] Theorem Provers

Automated Fact Checking

[MINI] Single Source of Truth

Detecting Fast Radio Bursts with Deep Learning

Being Bayesian

Modeling Fake News

The Louvain Method for Community Detection

Cultural Cognition of Scientific Consensus

False Discovery Rates

Deep Fakes

Fake News Midterm

Quality Score

The Knowledge Illusion

Click Through Rates

Algorithmic Detection of Fake News

Ant Intelligence

Human Detection of Fake News

Spam Filtering with Naive Bayes

The Spread of Fake News

Fake News

Dev Ops for Data Science

First Order Logic

Blind Spots in Reinforcement Learning

Defending Against Adversarial Attacks

Transfer Learning

Medical Imaging Training Techniques

Kalman Filters

AI in Industry

AI in Games

Game Theory

The Experimental Design of Paranormal Claims

Winograd Schema Challenge

The Imitation Game

Eugene Goostman

The Theory of Formal Languages

The Loebner Prize


The Master Algorithm

The No Free Lunch Theorems

ML at Sloan Kettering Cancer Center

Optimal Decision Making with POMDPs

AI Decision-Making

[MINI] Reinforcement Learning

Evolutionary Computation

[MINI] Markov Decision Processes

Neuroscience Frontiers

Neuroimaging and Big Data

The Agent Model of Artificial Intelligence

Artificial Intelligence, a Podcast Approach

Holiday Reading 2017

Complexity and Cryptography

Mercedes Benz Machine Learning Research

[MINI] Parallel Algorithms

Quantum Computing

Azure Databricks

[MINI] Exponential Time Algorithms

P vs NP

[MINI] Sudoku ∈ NP

The Computational Complexity of Machine Learning

[MINI] Turing Machines

The Complexity of Learning Neural Networks

[MINI] Big Oh Analysis

Data science tools and other announcements from Ignite

Generative AI for Content Creation

[MINI] One Shot Learning

Recommender Systems Live from FARCON 2017

[MINI] Long Short Term Memory

Zillow Zestimate

Cardiologist Level Arrhythmia Detection with CNNs

[MINI] Recurrent Neural Networks

Project Common Voice

[MINI] Bayesian Belief Networks


[MINI] Conditional Independence

Estimating Sheep Pain with Facial Recognition


[MINI] The Vanishing Gradient

Doctor AI

[MINI] Activation Functions

MS Build 2017

[MINI] Max-pooling

Unsupervised Depth Perception

[MINI] Convolutional Neural Networks

Multi-Agent Diverse Generative Adversarial Networks

[MINI] Generative Adversarial Networks

Opinion Polls for Presidential Elections



[MINI] Backpropagation

Data Science at Patreon

[MINI] Feed Forward Neural Networks

Reinventing Sponsored Search Auctions

[MINI] The Perceptron

The Data Refuge Project

[MINI] Automated Feature Engineering

Big Data Tools and Trends

[MINI] Primer on Deep Learning

Data Provenance and Reproducibility with Pachyderm

[MINI] Logistic Regression on Audio Data

Studying Competition and Gender Through Chess

[MINI] Dropout

The Police Data and the Data Driven Justice Initiatives

The Library Problem

2016 Holiday Special

[MINI] Entropy

MS Connect Conference

Causal Impact

[MINI] The Bootstrap

[MINI] Gini Coefficients

Unstructured Data for Finance

[MINI] AdaBoost

Stealing Models from the Cloud

[MINI] Calculating Feature Importance

NYC Bike Share Rebalancing

[MINI] Random Forest

Election Predictions

[MINI] F1 Score

Urban Congestion

[MINI] Heteroskedasticity


[MINI] Paxos

Trusting Machine Learning Models with LIME


Machine Learning on Images with Noisy Human-centric Labels

[MINI] Survival Analysis

Predictive Models on Random Data

[MINI] Receiver Operating Characteristic (ROC) Curve

Multiple Comparisons and Conversion Optimization

[MINI] Leakage

Predictive Policing

[MINI] The CAP Theorem

Detecting Terrorists with Facial Recognition?

[MINI] Goodhart's Law

Data Science at eHarmony

[MINI] Stationarity and Differencing


[MINI] Bargaining


[MINI] Auto-correlative functions and correlograms

Early Identification of Violent Criminal Gang Members

[MINI] Fractional Factorial Design

Machine Learning Done Wrong


[MINI] The Elbow Method

Too Good to be True

[MINI] R-squared

Models of Mental Simulation

[MINI] Multiple Regression

Scientific Studies of People's Relationship to Music

[MINI] k-d trees

Auditing Algorithms

[MINI] The Bonferroni Correction

Detecting Pseudo-profound BS

[MINI] Gradient Descent

Let's Kill the Word Cloud

2015 Holiday Special

Wikipedia Revision Scoring as a Service

[MINI] Term Frequency - Inverse Document Frequency

The Hunt for Vulcan

[MINI] The Accuracy Paradox

Neuroscience from a Data Scientist's Perspective

[MINI] Bias Variance Tradeoff

Big Data Doesn't Exist

[MINI] Covariance and Correlation

Bayesian A/B Testing

[MINI] The Central Limit Theorem

Accessible Technology

[MINI] Multi-armed Bandit Problems

Shakespeare, Abiogenesis, and Exoplanets

[MINI] Sample Sizes

The Model Complexity Myth

[MINI] Distance Measures


[MINI] Structured and Unstructured Data

Measuring the Influence of Fashion Designers

[MINI] PageRank

Data Science at Work in LA County

[MINI] k-Nearest Neighbors


[MINI] MapReduce

Genetically Engineered Food and Trends in Herbicide Usage

[MINI] The Curse of Dimensionality

Video Game Analytics

[MINI] Anscombe's Quartet

Proposing Annoyance Mining

Preserving History at Cyark

[MINI] A Critical Examination of a Study of Marriage by Political Affiliation

Detecting Cheating in Chess

[MINI] z-scores

Using Data to Help Those in Crisis

The Ghost in the MP3

Data Fest 2015

[MINI] Cornbread and Overdispersion

[MINI] Natural Language Processing

Computer-based Personality Judgments

[MINI] Markov Chain Monte Carlo

[MINI] Markov Chains

Oceanography and Data Science

[MINI] Ordinary Least Squares Regression

NYC Speed Camera Analysis with Tim Schmeier

[MINI] k-means clustering

Shadow Profiles on Social Networks

[MINI] The Chi-Squared Test

Mapping Reddit Topics with Randy Olson

[MINI] Partially Observable State Spaces

Easily Fooling Deep Neural Networks

[MINI] Data Provenance

Doubtful News, Geology, Investigating Paranormal Groups, and Thinking Scientifically with Sharon Hill

[MINI] Belief in Santa

Economic Modeling and Prediction, Charitable Giving, and a Follow Up with Peter Backus

[MINI] The Battle of the Sexes

The Science of Online Data at Plenty of Fish with Thomas Levi

[MINI] The Girlfriend Equation

The Secret and the Global Consciousness Project with Alex Boklin

[MINI] Monkeys on Typewriters

Mining the Social Web with Matthew Russell

[MINI] Is the Internet Secure?

Practicing and Communicating Data Science with Jeff Stanton

[MINI] The T-Test

Data Myths with Karl Mamer

Contest Announcement

[MINI] Selection Bias

[MINI] Confidence Intervals

[MINI] Value of Information

Game Science Dice with Louis Zocchi

Data Science at ZestFinance with Marick Sinay

[MINI] Decision Tree Learning

Jackson Pollock Authentication Analysis with Kate Jones-Smith

[MINI] Noise!!

Guerrilla Skepticism on Wikipedia with Susan Gerbic

[MINI] Ant Colony Optimization

Data in Healthcare IT with Shahid Shah

[MINI] Cross Validation

Streetlight Outage and Crime Rate Analysis with Zach Seeskin

[MINI] Experimental Design

The Right (big data) Tool for the Job with Jay Shankar

[MINI] Bayesian Updating

Personalized Medicine with Niki Athanasiadou

[MINI] p-values

Advertising Attribution with Nathan Janos

[MINI] Type I / Type II Errors