Without knowing it, we are all data scientists. We try to forecast whether it is going to rain. When you buy a carton of milk that is going to expire tomorrow, you try to predict whether it is actually going to go bad. When you see everyone wearing a certain color, you try to find out why. And of course, you want to know the best time to text your boyfriend or girlfriend so that he or she actually responds. These are all the kinds of questions data scientists work with.
Data provides valuable insight into human behavior. This makes it a gold mine for businesses that want to make informed strategic decisions. Data scientists analyse huge sets of data and see patterns where others see nothing. With the help of modern technology, data is becoming more accessible, making data science one of the most popular jobs of the 21st century.
Are you curious about everyday scenarios? Does hunting for and analysing data sound like fun? Are you looking for a way to combine your love of programming, statistics and business analytics? In this post, we talk to data scientist Elvis Bando of Farm Drive about his experiences in the field.
Years of experience: 8+
Company: Farm Drive
Education: BSc, Software engineering
Career path: I previously worked as a data consultant, advising a number of organisations on how to build data. I have also worked with M-Farm, where I built models to help advise farmers on what time to plant, what will do well, etc. Currently, I am a data scientist at Farm Drive, which provides alternative credit scoring to smallholder farmers. We use mobile phones, alternative data and machine learning to close the critical data gap that prevents financial institutions from lending to creditworthy smallholder farmers. My work involves trying to understand the risks that farmers are exposed to. I develop predictive models to understand how farmers behave under different circumstances.
Why did you choose this field?
It has been a journey for me. A logical progression. I am naturally a very curious person. From a young age, I started with hardware programming which I found to be very manual. Then, I discovered software and I actually did software engineering at university. But I soon realised that software is not so intelligent. A programmer has to simulate different scenarios. I needed something different and I got it through data science, artificial intelligence and machine learning. With them, I didn’t have to program everything. It satisfied my curiosity in a new way. So, I could say the field chose me.
How did you acquire your skills? What was your first job as a data scientist and how did you land it?
I had the hard skills from my software engineering degree. A few challenges led me into data science. I was involved in a project where we wanted to build an answer engine similar to Google. The problem was that our solution was human-based and quite manual. Searching for a way to automate our solution introduced me to artificial intelligence and machine learning. From there, I did various courses in statistics, and this grounded my foundation as a data scientist.
The first data science project I worked on was for Innovations for Poverty Action (IPA), an organisation that uses research to develop innovations for poor people. It involved providing water dispensers to rural areas. Our task was to inform iterations of the design by predicting which parts of the dispensers were most likely to break.
What does a typical day as a data scientist look like? What tools do you often use and which are your favorites?
It depends on the project, but 80% of the time I am doing data cleaning, data preparation, and building data pipelines. There is a lot of trying and testing. I am always asking myself how data sets link. Building models takes a while depending on the complexity. And even when you build a model, you have to tune it, so my work involves a lot of reviewing. I use Python and R. They have different strengths depending on the task you want to achieve, but I lean more towards R because it has diverse libraries and a strong community of developers.
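To give a feel for what that cleaning and preparation work can look like, here is a minimal Python sketch. It is not taken from Farm Drive's pipeline: the records, field names and the median fill rule are all made up for illustration, and a real project would likely lean on a library such as pandas or on R.

```python
from statistics import median

# Hypothetical raw records; the field names and values are invented for illustration.
raw_records = [
    {"farmer_id": " F001 ", "acres": "2.5", "crop": "Maize"},
    {"farmer_id": "F002", "acres": "", "crop": "beans "},
    {"farmer_id": "F003", "acres": "4", "crop": "MAIZE"},
    {"farmer_id": "F001", "acres": "2.5", "crop": "maize"},  # repeat of the first farmer
]


def clean(records):
    """Trim whitespace, normalise case, parse numbers, and drop repeated IDs."""
    seen = set()
    cleaned = []
    for row in records:
        farmer_id = row["farmer_id"].strip()
        if farmer_id in seen:
            continue  # keep only the first record per farmer
        seen.add(farmer_id)
        acres_text = row["acres"].strip()
        cleaned.append({
            "farmer_id": farmer_id,
            "acres": float(acres_text) if acres_text else None,
            "crop": row["crop"].strip().lower(),
        })
    return cleaned


def fill_missing_acres(records):
    """Replace missing acreage with the median of the known values (an assumed rule)."""
    known = [r["acres"] for r in records if r["acres"] is not None]
    fallback = median(known)
    return [
        {**r, "acres": fallback if r["acres"] is None else r["acres"]}
        for r in records
    ]


# A tiny two-step "pipeline": clean first, then fill the gaps.
prepared = fill_missing_acres(clean(raw_records))
for row in prepared:
    print(row)
```

Each step is a small function, so steps can be tested on their own and chained in order, which is roughly what building a data pipeline means in practice.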
How can someone expect to get into data science?
Data science might sound hard but it’s not as hard as you imagine. It’s incorporated in our day-to-day life but we just don’t take notice. Data science is an interesting field and there is no barrier to entry. Anyone looking to get into data science should have intellectual curiosity. If you are curious enough, you can learn the tools that will enable you to get into the field.
A background in statistics is important to help you understand the data (the different levels, presentation, etc.) because you need to know what the output of your data means. Almost all fields are ripe for data scientists. If you are already an expert in your field, you can use your knowledge to apply data science and build something that works for you.
There is a misconception that one needs to know how to code in order to become a data scientist. Code is just a tool, and it should not discourage you. That said, with the high volume of data you will encounter, some knowledge of code is needed.
What are the specific things that you would look out for when adding someone to your team?
Someone who has experience with different data sets will have an advantage because they will have developed a diverse data skill set and the ability to see things differently. Therefore, anyone interested in getting into data science needs to put in the work. You might have to take on some pro bono work to learn.
What resources are available for young people who want to break into data science?
Kaggle is a good resource. It is based on competitions. People post diverse data sets, compete at building models and share code, which is helpful because you get to see how people approach solving problems. The competitions also use industry-level data, the kind you will be working with in the real world to solve problems. Using cleaned data to practise might not help much.
I would also recommend joining email lists of data science blogs for the tools that you use. Personally, I use R-bloggers. I might not find the time to check websites all the time, but if they are sending me emails, I can always keep myself in the know. Things are always changing and you have to keep updated on how to be better.
What is the most rewarding thing about working in data science?
The biggest satisfaction of working in data science is that you build solutions that the masses can use. People actually use the models you build. You always see your work being used.
What’s the most interesting data science project you have participated in?
Most of the projects I have worked on are interesting. I am currently doing credit scoring for farmers, where I answer lots of questions on how people behave under different circumstances. I measure and look at different things. How will the farmer pay? Will he have a challenge paying back? What will his repayments look like 10 months from now? Predictive modelling is intriguing.
Data science is very hot right now. How in your opinion will big data change Africa?
Africans are just discovering the tools that will help us get knowledge from data. Forward-looking governments are trying to digitize their data so that they can use it to make decisions. In Kenya, the government releases information and people create tools to analyse the data. Data is being liberalized and the masses are getting involved.
African organisations are just realizing the power of big data. They now know that data is valuable, but they don’t have the capacity to deal with it, so they are hoarding data, which means they can’t explore what is happening. Organisations need to learn that data science is not just about having data. A supermarket, for example, generates a lot of data on its customers and the products it sells. However, it is metadata (data about data) that is important and not the original data that all supermarkets have. For example, when do customers buy? How many times do they buy a particular product? Can we build a connection out of that? Should we put diapers near alcohol for fathers who want to appease moody mothers? Will they buy more?
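As a rough illustration of the supermarket questions above, the Python sketch below counts how often each product is bought and how often two products land in the same basket. The baskets and product names are invented for the example, and the `confidence` helper is a simple hypothetical measure, not a method described in the interview.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; each set is one customer's purchase.
baskets = [
    {"diapers", "beer", "bread"},
    {"diapers", "beer"},
    {"milk", "bread"},
    {"diapers", "milk", "beer"},
    {"bread", "milk"},
]

# How many baskets contain each product?
item_counts = Counter(item for basket in baskets for item in basket)

# How many baskets contain each pair of products?
# Sorting makes every pair appear in one consistent order.
pair_counts = Counter(
    pair for basket in baskets for pair in combinations(sorted(basket), 2)
)


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print(item_counts["diapers"])                # baskets with diapers
print(pair_counts[("beer", "diapers")])      # baskets with both products
print(confidence("diapers", "beer"))         # how often diaper buyers also buy beer
```

A high share for a pair like this would be a hint, not proof, that shelving the two products near each other is worth testing.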
I see lots of potential for big data in Africa. Today, I am working on satellite imagery, rainfall and building a score for farmers. You never know what is next.