Data Sutram processes unstructured data, creates actionable information to be consumed by data scientists
Says Rajit Bhattacharya, CEO of Data Sutram, who has faced many challenges in trying to solve the problem of unstructured data. There are 9 lakh data companies in the US, while India has hardly nine. Government initiatives are pushing for a better data management system, but that too will take time, he points out.
Neil Banerjee July 23, 2022
Data, when collected, is unstructured, unrefined and layered and needs to be tagged so that meaningful insights can be gleaned from it. Data scientists are the players who are taking this game to the next level by providing valuable and structured data to companies so that they can enhance their businesses. SME Futures spoke to Rajit Bhattacharya, CEO and Co-Founder of Data Sutram to unravel the multiple aspects of this data business.
Here are the excerpts from the interview:
How is Data Sutram contributing to the data-driven space and what are the challenges that it is facing?
The huge challenge that India’s AI and ML space faces is that whenever you try to build any kind of AI model, getting the right data is always a problem. Because if you don’t train your models and intelligence with the right data, it’s often a case of garbage in, garbage out. That has been one of the most difficult challenges I’ve faced as an analytics engineer at Data Sutram.
This pattern has been observed in even the country’s largest corporations. The lack of data is a problem because India experienced its IT revolution only in the last 20 years. Following that, we obtained a lot of data, but we were unable to use it. What remains is unstructured data that is largely unusable and not directly consumable by humans.
This is where Data Sutram comes in; it functions as an engine that sits on top of all of these unstructured data layers. And it processes, cleans, blends, and creates actionable information that can be consumed correctly by data scientists and analysts across the board who can work and act on it. The main issue that we’re attempting to address is that of unstructured data.
In terms of growth percentage, India has the highest rate of new data scientists coming up every year. But when it comes to the number of data companies that exist, the US has over 9 lakh, whereas India has hardly nine. That’s where the supply-demand gap comes in, and that’s what we are here to solve.
What are the legal aspects to take care of when acquiring such sensitive data?
We comply with the General Data Protection Regulation (GDPR). We anonymise hashed data and use it in that format. We do not accept any personally identifiable information (PII) from our vendors. We have several customers and clients in the US and European markets who require compliance with these laws.
If I had to pick one of the challenges that we face, it would be the lack of maturity in the Indian market. Even once a framework is created, I believe that there is still a lot of work to be done. Despite the government’s efforts to enact progressive legislation, the market remains ambiguous. There is a lack of understanding about how to shape these laws and about whether to take a western route or a separate route. Which framework model will be implemented? So, there are difficulties like that.
How does Data Sutram help companies in making strategic decisions that impact their business?
We specialise in easing the top line and protecting the bottom line of any company. For instance, if an FMCG company or bank wants to increase their sales, we can map out where their right target audience is; plan out where they should spend their budget; help them to plan out and properly allocate their resources and aid them in identifying the right people at the right time.
Similarly, if an FMCG company wishes to grow, it would want to know which shops to push its products to and where its demand can be increased. That’s where our intelligence comes in. It would also want to know how many sales an agent brings in there and what marketing strategy it should employ there.
Data Sutram assists in the acquisition of data from a strategic standpoint, which aids in understanding the entire planning process from both a budget and resource standpoint. Our current value is in the SME lending space. It is a prominent sector in which the government mandate has pushed financial corporations across the country to make loans to kirana stores and merchants.
We provide these businesses with alternative score cards, which give a bank valuable analytics on various factors so that it can understand them better: for instance, the customer behaviour in a kirana store or the income performance of the shop itself. This helps a bank decide whether to disburse the loan or not, as the business of these shops typically does not show up in bank accounts and cash books. Financial corporations often struggle to determine the loan amount that they can give to a trusted user, which is where our second application comes in.
You handle banking clients and have helped them in ATM analytics, how does that work?
When it comes to ATMs, the biggest challenge is the downturn period. What is the amount spent, and what percentage of that is spent on credit cards vs debit cards vs cash vs UPI? Understanding the transaction patterns and the volume of those patterns is key. Also, we need to know whether the card users are present, where the cash demand is, and whether people are likely to transact money.
An ATM is planned depending on the audience’s maturity. One factor is the digitalisation in terms of credit card usage. Another is the volume of business done in terms of cash transactions. The third is the scope of business in terms of what kind of shops are established there and how likely they are to accept different kinds of payment methods.
All of these characteristics contribute to the model’s creation.
How is privacy protected while converting the data into insights? Do you maintain any filters to prevent data breaches?
We are responsible for a few levels. Assume there is a person named X and knowing X’s phone number is a violation of their privacy. However, the data may include factors such as gender, age, or location: for example, male, average, 20-30 years old and living on Park Street, Kolkata. I automatically anonymise all forms of information.
Aside from identification, the data revealed is…that he is the user who is shopping at Shoppers Stop; however, I have no name or phone number to breach, identify, or pinpoint. I understand X’s behaviour, but I don’t know who X is; I only know his profile. That’s the first round of anonymisation that we do.
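As a rough illustration of this first round, the sketch below drops the direct identifiers from a record and keeps only a coarse profile. It is not Data Sutram’s actual pipeline: the field names, the `anonymise` and `age_band` helpers, the keyed-hash token and the ten-year age bands are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical secret held only by the party doing the anonymisation.
SECRET_KEY = b"example-key-not-for-production"

# Fields treated as direct identifiers in this sketch (an assumption).
DIRECT_IDENTIFIERS = {"name", "phone"}


def age_band(age, width=10):
    """Replace an exact age with a coarse band such as '20-30'."""
    low = (age // width) * width
    return f"{low}-{low + width}"


def anonymise(record, key=SECRET_KEY):
    """Return a profile with no name or phone number.

    A keyed SHA-256 hash of the phone number is kept as an opaque token,
    so repeat behaviour can be linked without storing the number itself.
    """
    token = hmac.new(key, record["phone"].encode("utf-8"), hashlib.sha256).hexdigest()
    profile = {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS and field != "age"
    }
    profile["age_band"] = age_band(record["age"])
    profile["token"] = token
    return profile
```

Calling `anonymise` on a record with a name, a phone number, an age of 27 and a locality returns the locality, the band `20-30` and the token, which matches the idea of knowing X’s profile without knowing who X is.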
Aggregation is the second level of data security that we provide. Our intelligence is at the 100m block level. If I look at how consumer behaviour is happening in a location, I’d draw a 100m circle around it and say, “This is how people are behaving.” This way, I’m not directing behaviour at an individual level, which is another type of aggregation that occurs. None of our data partners provide us with direct data; instead, we have scripts running in their engines that aggregate the information and send it to us.
At our end, all the information that has been stored is aggregated and anonymised at an individual PII level.
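The second level can be sketched in a similar way. The example below snaps each event to a roughly 100m grid cell and reports only per-cell counts. It is an illustration under stated assumptions, not the company’s method: square cells stand in for the 100m circle, the `cell_id` and `aggregate` helpers and the event fields are hypothetical, the metres-per-degree figure is an approximation, and the `min_count` suppression of sparse cells is an extra safeguard that the interview does not mention.

```python
import math
from collections import Counter

CELL_METRES = 100                # block size mentioned in the interview
METRES_PER_DEGREE_LAT = 111_320  # approximate length of one degree of latitude


def cell_id(lat, lon, cell=CELL_METRES):
    """Map a coordinate to the (row, col) index of a roughly 100m cell."""
    row = math.floor(lat * METRES_PER_DEGREE_LAT / cell)
    # Use the row's base latitude so every point in a row shares one scale.
    row_lat = row * cell / METRES_PER_DEGREE_LAT
    metres_per_degree_lon = METRES_PER_DEGREE_LAT * math.cos(math.radians(row_lat))
    col = math.floor(lon * metres_per_degree_lon / cell)
    return (row, col)


def aggregate(events, min_count=3):
    """Count events per (cell, category) and drop cells with too few events."""
    counts = Counter(
        (cell_id(event["lat"], event["lon"]), event["category"]) for event in events
    )
    return {key: n for key, n in counts.items() if n >= min_count}
```

Only the cell index, the category and the count leave this function, so a consumer of the output sees how people in a block behave rather than what any one person did.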
Analytics as a sector is growing in India; what opportunities do MSMEs have as stakeholders in this sector?
The challenge is that while the number of data scientists in this field is growing, analytics as a field in terms of implementation is still lacking.
At the end of the day, even for the largest MNCs and business leaders, the challenge that analytics is facing is that we are not yet able to measure the Return on Investment (ROI) of the impact that analytics is making in everyday life. Banking and finance are very advanced, in my opinion. That’s why, when we first started, one of our first choices was finance, where people understand data and analytics.
Analytics in retail in the US or Europe is very much ahead of India. We are hardly doing anything here. MSMEs can play an important role because it is difficult to measure impact in larger organisations; it takes 3-4 years to measure impact in larger organisations, but not in the MSMEs. In fact, I recall, particularly in Chennai, that even traditional single-store jewellery shops use analytics in their day-to-day operations, whereas this is not the case in Northern India.
Analytics is easier for MSMEs to adopt in terms of implementation. The leaders of these MSMEs must focus and treat analytics as an integral component of their business, which will take time.
To use an analogy, 20 years ago, websites were nice to have but now they are required. That is how long India takes. It has happened everywhere, and it will happen here as well.
What’s the roadmap for Data Sutram?
For us, the mission is very straightforward. We want to be a global player. Currently, we are working with over 50 enterprises in India, but the plan is to scale up and contribute to at least 10 per cent of India’s GDP in the next two years by serving the top Fortune 500 companies in India and giving them data that will have an impact on their business.
That is where we are concentrating our efforts. Eventually, the bigger milestone will be to serve outside of India and expand into the Middle East and Southeast Asia, both of which are large developing markets for unstructured data, and to be able to build on it. The larger goal for Data Sutram is to become an Indian global data player.