Ana Sousa is a Data Scientist at Cxense, a data management platform that offers online advertising, analytics, conversion rate optimization and content recommendations for publishers.
Econsultancy caught up with her to find out what a data scientist really does, the skills she uses in her role, why her job was “love at first sight”, and the data-driven companies she most admires.
Please describe your job: What do you do?
Ana Sousa: I help implement data science projects, either within Cxense or together with a customer. I regularly talk with experts on the customer side and explain the models and techniques we have. But I also liaise with people who are less technically proficient, so part of my job is also translating technology into an easy-to-understand language.
A common misconception with my job is that we deal with data infrastructure or data collection and quality, which is not the case. We start when the data is ready to be used.
Where do you sit within the organisation? Who do you report to?
Ana Sousa: I report to the head of the data science team and he, in turn, reports directly to the CTO. I’m leading one of our data science products, which also involves coordinating and leading the people who work on that product.
What skills do you need to be effective in your role?
Ana Sousa: One thing that is extremely important these days is being able to explain what a model is trying to achieve and what measures you are using, so that people can trust the model. Clear communication is very important for people who don’t understand all the technical details.
But, obviously, being able to code and having intuition about various machine learning techniques, manipulating data and the features that are going to define a project – those are pretty core.
Most of the people moving in this direction have a background in computer science, statistics and maths. People who are more into data analytics would want to get better at coding and people with a software development background would need to become stronger in their understanding of machine learning. My background is in maths and statistics.
Tell us about a typical working day…
Ana Sousa: There aren’t many days that are the same in this job! Most of my time is spent coding for models, putting models into production or troubleshooting. Then, there’s a lot of interaction with other colleagues as we discuss data and strategy, or indeed with the customer themselves.
Learning is an important part of it. In my team, we often have data science workshops and spend time reading through published articles from the scientific community to see how we can experiment with our ideas. We do have a lot of freedom for experimentation. I like to work that way even if the code isn’t 100% perfect in the beginning. But we experiment and keep improving it.
What do you love about your job – and what sucks?
Ana Sousa: I have massive amounts of data available at my fingertips and there are many interesting problems to explore. I end up learning a lot, which is hugely important for me. I also love that freedom to experiment that I mentioned earlier.
Not having all the data we would like, or finding it difficult to scale the code, can be frustrating. Sometimes a model doesn’t produce the results we wanted and then we have to spend a lot of time going back and forth. These frustrating parts help us get better of course – but I guess it depends on just how much frustration!
What goals do you have? What are the most useful metrics and KPIs for measuring success?
Ana Sousa: In machine learning we have all these KPIs in place to help us decide if something is a good model. But, at the end of the day, the model is only going to be good if it solves a real problem. And the only way to know is to see it live and see if it’s fit for purpose.
There are a few other things I consider important. I’m quite communicative and a people person so I want to keep sharing and spreading knowledge about data science and inspiring others, especially those who say data science is too hard and complex! On the other hand, I like the feeling that I’m doing something that’s technical but that also contributes to making the world a better place somehow.
What are your favourite tools to get the job done?
Ana Sousa: If I’m going to go technical, it would have to be Python. But besides that, I spend a lot of time on Medium reading articles. I also do a bunch of courses online, on the Coursera platform, looking at different fields, not just data science. It’s powered by universities from all over the world and that’s a cool way to learn new things. Also, technical books are my thing, the O’Reilly editions in particular.
How did you end up at Cxense, and where might you go from here?
Ana Sousa: I was found by a head-hunter and it was love at first sight! Even though there are many data science roles out there, there aren’t so many where data is actually the core of the business – or where the data is really structured and ready to use. On top of that, it’s a smaller company, which gives me the freedom to go from developing the model to using it live.
Where next? I really love the stories that data can tell, and I want to keep developing in the field. Perhaps to spread knowledge and inspiration, using it to contribute to a better world.
Which data-driven companies do you admire?
Ana Sousa: Three immediately come to mind. The first is DataKind. It’s exactly the sort of company that thinks about contributing towards a better world. Its aim is to serve humanity and the planet, and I find that really cool.
In terms of data science techniques more specifically, I like Airbnb and LinkedIn. They have really great products and really organic growth and have also contributed quite a lot to the community. Airbnb in particular publishes a lot on its blog and has a lot of open source code. LinkedIn, on the other hand, is just really, really smart at matching people up.
How can companies become more data literate?
Ana Sousa: It comes with education. It’s really important for me that education is a continuous thing, tailored to the business context of the company. Because, even if people aren’t working in data science day to day, they can still translate it for the rest of the organisation.