Let me whip out a little Torah here for a moment to explain. When the Israelites fled Egypt, they were pretty stoked about going to the promised land. But then after wandering for a while in the desert, they started to doubt that this whole desert-wandering was worth the trouble. And they began to miss their lives back in Egypt:
"remember the fish we ate in Egypt at no cost—also the cucumbers, melons, leeks, onions and garlic." -Numbers 11:5
Who WOULDN'T miss melons with garlic...wait, I must be reading that wrong.
Well, this fed-up-with-the-desert situation is something data scientists are helping to create rather than alleviate. We've taken on a gatekeeper role. We rightly say that data science is hard and doing it wrong is downright dangerous. This is true. But to be honest, there's also a bit of protectionism to this truth. Data scientists enjoy their rarity. It makes them sexier.
But by saying things like, "Even K-means is touchy!!! Don't try this at home!!!" we're simultaneously right and wrong.
The truth is that for businesses data science is not like milk -- as in, why buy the scientist when I can get the science milk for free? (That sounds gross)
No, data science is more like crack. Businesses want more not when it's held entirely out of their reach but when they've been allowed to successfully smoke a little. Then they're hooked. But for some SMBs, the upfront cost of hiring a data scientist before they see value in analytics is too high to commit to (for others, "when you have the opportunity to buy LeBron James, you buy").
I just got back from the Big Data Innovation Summit in Toronto, and while the typical topics came up in the actual talks (here's how we used Hadoop!), a different set of topics was prevalent in the lobby conversations. Chief concerns among attendees were:
"This all sounds great, but how does my business get started? How would we even use this stuff?"
"I'm feeling pressured to 'do big data' but a lot of the terms and techniques baffle me. And I feel embarrassed to ask questions to get 'caught up.'"
"And the reading materials are of no help...they all start with me %$&#ing configuring Hadoop on a VM."
"People are moving on from new technologies to even newer technologies, and I'm feeling left behind. Hell, I don't even understand exactly what data science is."
Recently I spoke with someone from an unnamed Fortune 500 who, I shit you not, was part of a team creating an internal knowledge base where "executives who don't understand anything about big data and data science can read plain explanations of things safely without embarrassment." Lordy! How did we get here? Embarrassment minimization as a business objective.
Businessfolk are attending conferences in an attempt to figure out if their business has a big data play -- this is a bit of a precursor to, and a bit of a substitute for, hiring an actual analytics person. They need to identify and fundamentally understand their opportunities before they commit, especially outside of the tech start-up world, where data scientists are being hired just because any self-respecting tech company should have one.
Here's what I propose. The analytics industry should concern itself with these lost souls. Sure, an MBA isn't going to get all of data science, but some basic training around the core practices and techniques would be helpful. Yes, I mean you, K-means!
It's not enough to just tell people "about" this stuff either. "Collaborative filtering is the practice of blah blah blah....go hire someone to do it." I think even managers need an opportunity to prototype a little bit with this stuff (much like how most MBA programs teach restocking point calculations, forecasting, and basic operations research).
The MBA is not likely to do the data science (like operations research) themselves. But when the opportunities to do data science come to them, they'll feel comfortable starting conversations with the pros and speccing out projects. They'll have just enough knowledge to be dangerous. That's a good thing.
That's why I started my other blog. And that's why I've got a book coming out that teaches data science in Excel. It's because we don't win by leaving people behind. We should separate data science education for these folks from functional programming and configuring software. That way they can get a foothold.
Data scientists' job security is bound up in industry perceptions of the value of data science. And while the "promise of a promised land" might hold people for a while, eventually they want the garlic and melon. Or crack...that too. (I get the mixed metaphor award today.) Let's go ahead and find ways to get people hooked before they move on to the next fad.