Why a Data Scientist with a PhD Might Kill Your Early Stage Startup

11/27/2024 Jesus Santana

Look beyond the degree and hire smarter by prioritizing these five traits instead

I’m a Data Scientist/ML Engineer without a STEM degree or PhD but with six years of experience in tech.

Despite not having a PhD or quantitative degree, I’ve managed to drive enormous impact. I’ve:

Developed ML credit models that have disbursed over US$900M,
Scaled a new market launch to over 2 million customers in only two years,
Led a team that managed ML credit and fraud models for seven different countries across the globe.

I think it is a myth that only PhDs can be great Data Scientists.

If college dropouts can be talented software engineers, why can’t non-PhDs with experience be great Data Scientists?

I believe that you don’t need a PhD to solve problems, especially at early-stage startups.

In fact, insisting on a PhD could end up being a costly mistake for early-stage startups: you spend valuable time and effort hiring for qualifications you don’t need yet.

I’ve learned the key traits that are important to succeed as a data scientist, and I’ll reveal them to help you make better hiring decisions.

Here are five traits to look for during the recruiting process.

Focus on speed over perfection

Early-stage startups need people who can move fast.

You need someone who can iterate quickly, test ideas and pivot when needed.

This is fundamentally different from academia, which values:

Deep research,
Perfectionism,
Meticulous optimization.

These skills are invaluable at big tech companies like Meta, which have mature systems in place and can afford to have scientists spend months researching small optimizations.

In contrast, early-stage startups benefit more from individuals who can quickly prototype solutions to get from zero to one. The focus should be on speed rather than perfection.

I believe in the 80/20 rule — identifying high-leverage solutions that require 20% of the effort to get you 80% of the results.

Hire someone with extensive experience and a track record of:

Moving fast,
Implementing data science solutions,
Driving huge business impact across the organization.

Someone who can get you 80% of the way there with 20% of the effort and time.

Then, after getting something in place that buys you more time and stability, by all means hire PhDs (or non-PhDs with the expertise and experience) to close the gap and get you closer to 100%.

Chase speed, not perfection.

Solutions to real-world problems

The world of software engineering values real-world experience over academic experience.

Think of the college dropouts who are glorified as prodigies, like Mark Zuckerberg, Bill Gates or Steve Jobs.

These stories of self-taught prodigies dominate the narrative around software engineering. Now, can you think of any prominent engineers in tech with a PhD?

It’s not that PhDs aren’t valuable. Big tech companies like Meta and Amazon hire PhDs as software engineers in highly specialized areas, but I have yet to come across one at a startup.

I believe the world of data science is no different.

Early-stage companies should prioritize candidates with hands-on experience applying data science and machine learning in the real world rather than in academic and theory-driven settings.

What works in a confined, academic setting is unlikely to work in the real world, where data is messy and noisy.

It takes:

Creativity,
Pragmatism,
Domain knowledge

to clean and wrangle real-world data and make it useful for decision-making or prediction.

Think of the differences between working on a Kaggle dataset versus real-world data at a company. Kaggle datasets are highly structured, and the competitions built on them ask you to optimize a single metric in a well-defined problem space.

While Kaggle exercises are valuable for learning algorithms and techniques, they don’t prepare you for the challenges in production environments.

The first time I built and implemented a machine learning model in production, I learned many valuable lessons that dozens of Kaggle datasets had never taught me.

Over years of deploying machine learning models in production environments, I’ve encountered many unexpected obstacles, including:

Data pipelines breaking,
Feature drift that degraded model performance,
Having to explain model predictions to non-technical stakeholders.

Overcoming these challenges taught me lessons that no amount of theory or Kaggle competitions could.
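To make feature drift concrete, here is a minimal Python sketch of one common way to check for it: the Population Stability Index (PSI), which compares a feature's distribution at training time with what the model sees in production. The function name, the bin count and the synthetic data are illustrative assumptions, not code from any system described in this post.

```python
import math
import random


def population_stability_index(expected, actual, n_bins=10):
    """Compare a feature's training-time distribution with its live one."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so live values outside the training range land in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bin_shares(expected), bin_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data (an assumption for illustration): one live sample that
# matches training, and one whose mean has shifted by one standard deviation.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(5000)]
live_same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
live_shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(train, live_same)
psi_shifted = population_stability_index(train, live_shifted)
print(f"no drift: {psi_same:.3f}, shifted: {psi_shifted:.3f}")
```

A commonly cited rule of thumb treats a PSI below roughly 0.1 as stable and above roughly 0.25 as a shift worth investigating, but those cut-offs are conventions rather than guarantees. The point is the 80/20 one: a few lines run on a schedule can flag a drifting feature long before anyone notices the model getting worse.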

Early-stage startups don’t have the luxury of time to develop the perfect solution; they need pragmatic solutions that deliver value now.

This is why I believe a data scientist who has real-world experience wrestling with noisy data and has learned from production failures is often a better fit than one with a purely academic background.

Diverse skillsets

The first data science hire needs to be comfortable with ambiguity.

There won’t be any processes, structured workflows or established standards in place.

Instead, this person will have to define what data science should look like at the company, which often involves wearing multiple hats to get things off the ground.

Ideally, this should be someone with a broad range of skills and knowledge in multiple areas in addition to data science, including:

Product analytics,
Product management,
Software engineering.

They should have hands-on experience working with various tools and tech stacks, so they can implement solutions and contribute directly to production systems.

They need to be both strategic and tactical — they should be able to:

Identify problem areas,
Roll up their sleeves to come up with pragmatic data science solutions,
Work with different stakeholders to implement these solutions.

In my experience as the first data scientist at a startup, I found myself stretched in ways I hadn’t anticipated.

I quickly learned that machine learning was not the most practical solution to every problem, and that’s where having diverse skills helped me thrive.

There were times when I:

Dove into customer behavior analysis to identify patterns and recommend actionable insights.
Debugged production issues in collaboration with software engineers to ensure the integrity of our data pipelines and model outputs.
Collaborated with product managers to brainstorm and implement creative strategies for tackling fraud.

These experiences not only strengthened my skills in adjacent fields (e.g. data analytics, software engineering and data engineering) but also highlighted how important cross-functional collaboration is for a first data science hire.

The first hire plays a crucial role in shaping the future of the data science team at a company. This person will help to define the processes, tools, and culture that will guide the company’s data initiatives.

Hiring a first data scientist with a diverse skillset gives you more flexibility in project scope and in the areas where you can apply data science.

It will also increase your chances of successfully putting your first data science solutions into production.

A versatile data scientist who can step into multiple roles can lay the groundwork for a strong data science function and deliver pragmatic solutions that help the company grow and scale.

Communication and collaboration

My experience as a founding data scientist showed me how important it is to communicate effectively across different functions.

The engineers, product managers and leadership teams at the company may or may not have interfaced with a data scientist before.

So the first data science hire must have strong experience working with diverse teams in order to drive and execute on projects effectively.

A PhD with deep technical expertise but little experience working on cross-functional teams may not be effective at pushing projects forward, even if they:

Can develop a complex algorithm,
Have deep knowledge of extensive statistical theories,
Can build custom objective functions to optimize model performance.

What counts more is someone who can:

Recommend practical solutions that can make an impact in production,
Translate technical concepts into actionable insights for cross-functional teams,
Build buy-in and ensure alignment across the organization to push their solution forward.

In the real world, unlike when writing a PhD dissertation, no one works in a silo.

Make sure to hire someone who can effectively communicate technical concepts to business stakeholders.

Cultural fit

Every startup I’ve worked at has described its culture as:

Innovative,
Fast-paced,
Collaborative.

Early-stage startups are often more informal, more flexible and less structured than large companies and academia.

So startups should hire individuals who can thrive in this type of environment through open communication and close collaboration.

PhDs, especially those coming straight from academia, may struggle to adapt to the chaotic and less structured nature of startup life.

Every hiring manager should always ask themselves:

How well can a candidate adapt to startup life?
How and where would they add value to the team?
Can you picture this person contributing to the culture in a positive way?

What does this mean?

I’m not saying don’t hire PhDs. I have nothing against PhDs. In fact, I’ve worked with some brilliant colleagues who have PhDs.

Rather, I’m highlighting the important characteristics that make data scientists successful at early-stage startups from my experience.

I’ve seen hiring managers fawn over a candidate simply because they had a PhD, only for that candidate to ultimately fail to make an impact.

One of the reasons was that they went too deep into technical details and couldn’t communicate with other teams to push their project forward.

What I’m preaching are the important traits that you should be looking for:

Speed over perfection,
Real-world problem-solving skills,
Diverse skillsets and experiences,
Cross-functional collaboration and communication skills,
Cultural fit.

Look beyond the prestige of a PhD and learn to hire for the right skills that will move your startup forward.

Do you believe college dropouts can be talented engineers?

Then start believing that non-PhDs with real-world experience can be great data scientists.

Are you a founder or functional lead at an early-stage startup dealing with fraud?

I’ve put together a free 5-day email course on how to harness the power of data science to combat fraud. It’s a hands-on and practical guide tailored for the fast-paced world of startups based on my experience. 🚀

👉 Sign up here and start building smarter fraud detection solutions today.

https://ds-claudia.kit.com/b85685a8e2

Why a Data Scientist with a PhD Might Kill Your Early Stage Startup was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
