The developed world is literally in a state of awe because of the exponential growth of big data and the astounding opportunities it offers governments, institutions, corporations and organizations.
Big data has become seriously big business. The largest big data revenues, in millions of dollars and based on 2013 data, are: IBM $1,368, HP $869, Dell $652, SAP $545, Teradata $518, Oracle $491, SAS Institute $480, Palantir $418, Accenture $415 and PwC $312 (source: Umbel Corp.). These figures are even more impressive when you consider that the total revenue of these companies is 100 times more. These are some of the big names in the business.
- The lack of publicly available data, big or small, is a challenge in Pakistan. Experts believe that once open and public, data can be used to its full and true potential.
- Likewise, the proper use of technology within elections is also a challenge for the political parties. Tech and data savviness can actually help political parties a lot in refining their campaigns and attracting voters.
- Big data has massive potential that can be used across all sectors and institutions, with the ultimate goal of retrieving maximum benefits.
According to ALC, a U.S.-based firm founded in 1978 and a leading privately held provider of direct and digital data marketing services, “90 percent of the data in the world has been created in the last two years alone.” The sheer volume of data is overwhelming: for example, 2.2 million terabytes of new data are created every day. Making sense of it all requires specialists. It is being “generated at the speed of light, doubling every 40 months,” with unstructured data coming from “multiple sources like search data, social data, web-based behavioral data, mobile data and crowdsourced data. Possibly because of the anonymity of the internet, or simply the opportunity to be someone else for a while, the reliability of data is often in question,” the firm declares in an infographic titled ‘Evolution of Data.’
Big data revolution in the making
“The big data revolution is upon us,” states a Harvard Business Review article dated May 25, 2016, titled ‘Where Predictive Analytics Is Having the Biggest Impact,’ by Jacob LaRiviere, Preston McAfee, Justin Rao, Vijay K. Narayanan and Walter Sun. “Firms are scrambling to hire a new brand of analysts dubbed ‘data scientists,’ and universities have responded to this demand by introducing data science courses into degrees ranging from computer science to business. Survey-based reports find that firms are currently spending an estimated $36 billion on storage and infrastructure, and that is expected to double by 2020,” the HBR article confirms.
No doubt, there is a huge growth opportunity in the world of data: the collection, storage, management and extraction of valuable insights that help businesses reduce costs, make faster and better decisions and develop new products and services. “The industry itself is undergoing a fundamental transformation,” asserts Jim Baum, a former president and CEO of IBM Netezza, in his blog post titled ‘Smarter Planet or Smarter Computers?’ “You can see evidence of this everywhere you look. One particularly fascinating example is the ‘battle of the titans’ between Oracle and IBM as each aggressively positions itself for what it believes to be a massive growth opportunity,” he writes. “While both companies have serious technology in hardware and software, it is interesting that IBM seems to have focused their marketing energy on the social and business value of the solutions they can provide and Oracle is more focused on the technology of moving application capability ‘into the silicon.’ One thing is clear, this battle for the business of gaining more value from an increasing pool of data assets is just getting started. Just last week, another giant entered the fray. SAP announced its plans to acquire database and mobile solutions vendor Sybase for nearly $6 billion, adding Sybase’s capability to their growing analytics portfolio,” Baum maintains.
“To me, the most important element of this transformation and the companies that participate in it is the way in which we provide value from the solutions we create. As a result of tremendous innovation, low cost availability of massive compute power, and the low cost of storage and bandwidth, we can create solutions that were previously inconceivable. Solutions that help businesses make decisions based on what will happen in the future, not just based on what happened in the past. It is my belief that the businesses that are most committed to effectively using data assets and the tools to extract insight from them will either become or remain the leaders in their industries. Those that do not will find themselves lost in a haze of non-competitiveness,” Baum concludes.
Pakistan seems to be lost in this haze too. Most developing countries, including Pakistan, are either struggling to join the big data revolution or joining it at a very slow pace, badly missing out on the benefits of this hidden treasure trove.
“As ‘little’ data becomes ‘big’ data, it goes through several phases,” states a January 2016 report by the Federal Trade Commission (FTC) titled ‘Big Data, A Tool for Inclusion or Exclusion? Understanding the Issues.’ “The life cycle of big data can be divided into four phases: (1) collection; (2) compilation and consolidation; (3) analysis; and (4) use,” the report affirms.
“Big data analytics can provide numerous opportunities for improvements in society,” the FTC report continues. “In addition to more effectively matching products and services to consumers, big data can create opportunities for low income and underserved communities. For example, big data is helping target educational, credit, healthcare, and employment opportunities to low-income and underserved populations.”
The U.S. economy is data oriented. “The ‘big data’ presidency transcends intelligence-gathering and surveillance, encompassing the White House’s approach on matters from health care to reelection,” reports Nancy Scola in a Washington Post article titled ‘Obama, the ‘big data’ President.’ “A big-data fact sheet the White House put out in March 2012 — upon the launch of its $200 million Big Data Research and Development Initiative — listed more than 85 examples of such efforts across a number of agencies. They include the CyberInfrastructure for Billions of Electronic Records (CI-BER), led in part by the National Archives and the National Science Foundation, and NASA’s Global Earth Observation System of Systems (GEOSS), which the fact sheet described as a ‘collaborative, international effort to share and integrate Earth observation data.’ The Defense Department is putting about $250 million a year into the research and development of such projects — ‘a big bet on big data,’ as the White House called it,” writes Scola.
According to John P. Holdren, director of the White House Office of Science and Technology Policy, the big data initiative promises “to transform” the USA’s “ability to use big data for scientific discovery, environmental and biomedical research, education, and national security.”
Big data has not only transformed the way the U.S. government runs its operations, it has also led to data analytics in the electioneering process all over the world. The ‘swing’ voters are identified, their needs are understood and the manifestos are adjusted accordingly. “Social data drove the 2008 presidential elections,” writes Gurbaksh Chahal, founder and CEO of RadiumOne, for Wired in an article titled ‘Election 2016: Marriage of Big Data, Social Data Will Determine the Next President.’ “Big data drove the 2012 election. In 2016 it will be the marriage of the two that will determine the next President of the United States,” he asserts.
“2012 showcased the rise of political data science and big data, how campaign managers and pollsters gathered public opinions about each candidate, compared this with state-by-state election data and wove all the pieces into a comprehensive political campaign for each party,” he relates. “What social media was to 2008, data-driven campaigning was in year 2012. Math quants and data crushers were able to disseminate the mounds of campaign data at their fingertips into actionable targeting information to persuade prospective voters across the nation, and within strategic swing states,” he stresses.
Big data is the new buzzword in electioneering in Great Britain too. The entire structure works on the data available from different government and private entities, which analyze voter trends through the metadata. “The British political parties have started to move away from traditional electoral broadcast methods following the successes Obama enjoyed in 2008 and 2012 through heavy investment in digital,” writes Adriana Coppola in the Guardian article ‘Forget the manifesto: big data will win future elections.’ “The two main parties have subsequently hired communications experts integral to the Obama campaigns – Jim Messina for the Conservatives and David Axelrod for Labour – upped spend in social, and can now target people on Twitter via postcode,” she continues.
“In order to win, political parties must move from a broadcast approach, where the same message is sent to the entire country, to a much more targeted approach where data can be used to have direct conversations with individuals on the things that matter most to them. While the Conservatives are spending around £100,000 per month on Facebook, users are getting the same banal updates that do little aside from document the electoral campaign. They could be a lot smarter with their messaging to talk about different manifesto points to different people,” she declares.
“Election campaigns used to be about kissing babies and holding town hall debates,” says an Australian ABC TV production titled ‘Social media, the new electioneering frontier.’ “But now, in the digital century, the 2016 campaign is set to be a sophisticated machine with careful social media messages and big data collection,” asserts the program.
“Canadian Prime Minister Justin Trudeau’s successful campaign this year reached 40 percent of Canada’s population through Facebook and Instagram alone, making some say that in this 2016 Australian election, politicians don’t need to get your vote through TV sound bites anymore,” declares Andy Park, a reporter and a guest on the same TV program. He joined the 2016 election campaign trail for 7.30, and on the cusp of the election, he saw inside the machine that is coming for Australia’s vote. “It’s a simple premise with a 21st Century edge: target swing seats, get residents talking with other residents, log the data and send it back to campaign HQ,” he reports.
Another guest, Dr. Andrea Carson, a political scientist at the University of Melbourne who has studied how politicians engage with their audiences on social media during election campaigns, quickly adds, “What social media offers is that every single candidate can have a social media profile. They can be on Twitter, they can be on Facebook, they can take pictures of themselves, selfies, they can put up memes, they can put up video and it’s cheap, it’s speedy – you just load it up and it’s there – and it’s a way of reaching hard-to-reach voters.” “Since 2007 to 2013 there’s been a 240 percent increase in politicians using social media sites such as Facebook and Twitter,” she further comments.
A number of analysts believe that the electoral success of Narendra Modi in India also reflects his being tech-savvy. The Bharatiya Janata Party (BJP) used big data sets to understand the demands of the masses. This helped shape the core issues and media campaign in the 2014 elections in India. The party managed to raise funds, edit advertisements and engage voters in different regions on the basis of data analysis. “Modi is perhaps one of the most tech-savvy politicians in the world and certainly the most active in India,” says Amit Sheth, a professor at Wright State University’s Knowledge Computing Center in Ohio, as reported in a CNBC news story titled ‘How big data has changed India elections.’ “The result: a data-driven election campaign not very different from that of U.S. President Obama’s, albeit somewhat smaller in size, scale and perhaps style,” reports Neerja Pawha Jetley for CNBC. “Modi’s unique digital events like ‘Chai pe Charcha’ (Talk over tea) are unprecedented election events that put the political leader directly in touch with people at tea stalls in villages at publicized localities through a combination of satellite, DTH, internet and mobile,” Jetley reports further. Apparently, Arvind Gupta, the BJP master strategist behind Modi’s 3.67 million followers on Twitter, 12 million likes on Facebook and the party’s 68 million page views on Google Plus, developed “customized digital tools based on both commissioned and open source data,” that put them in “direct touch” with their voters.
“In 2012 things started to flatten out,” comments Joe Trippi in a TechRepublic.com article titled ‘Election Tech: How social media and big data changed everything, a Q&A with Joe Trippi.’ A veteran campaign media master, he has built companies and, for more than three decades, advised some of the biggest names in politics (Ted Kennedy, Walter Mondale, Dick Gephardt) on how to successfully apply cutting-edge media technology. “There was less emphasis on innovation that empowered people, and more innovation focused on data mining and analytics. By 2016 there has been almost nothing new that really matters. Most of the campaigns are just trying to catch up with 2008 and 2012, and most of the focus has moved from empowering people, to frankly, manipulating people in a targeted way,” Trippi insists.
How does the data industry work and where do we stand
The data industry works in different ways. It supports predictive activities, such as research to anticipate certain attitudes and behaviors. It can predict the prices of commodities. It is also used to predict how some events might influence other actions or events, and to ascertain trends and the likely success or failure of a startup.
Business Intelligence (BI) tools are the medium. They use a number of custom-developed algorithms to derive information and conduct predictive analysis, and they help decision-makers in governance and management make better decisions.
These tools and software run through the data streams. For instance, the congestion zone in London is monitored using Optical Character Recognition (OCR) technology: a car’s number plate is recorded and its time is gauged through the BI tools. Similarly, a traffic management system using this technology has been introduced in Norway. It helps with diversions, traffic control and any specific measures needed during congestion or other emergency situations. The overall working of the BI tools is divided into data segregation, analysis and finally plotting.
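The three stages named above (segregation, analysis, plotting) can be illustrated with a minimal sketch. It is not how the London or Norwegian systems actually work: the plate reads, the function names and the idea of measuring the minutes between a plate's first and last sighting are all assumptions made purely for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical OCR output: (number plate, time the camera read it).
PLATE_READS = [
    ("ABC123", "2016-06-01 08:00"),
    ("XYZ789", "2016-06-01 08:10"),
    ("ABC123", "2016-06-01 08:45"),
    ("XYZ789", "2016-06-01 10:10"),
    ("ABC123", "2016-06-01 09:30"),
]


def segregate(reads):
    """Step 1 (segregation): group the raw reads by number plate."""
    by_plate = defaultdict(list)
    for plate, stamp in reads:
        by_plate[plate].append(datetime.strptime(stamp, "%Y-%m-%d %H:%M"))
    return by_plate


def minutes_between_sightings(by_plate):
    """Step 2 (analysis): minutes between first and last sighting per plate."""
    return {
        plate: int((max(times) - min(times)).total_seconds() // 60)
        for plate, times in by_plate.items()
    }


def text_plot(durations):
    """Step 3 (plotting): a crude text chart, one '#' per 10 minutes."""
    return [
        f"{plate} {'#' * (mins // 10)} {mins} min"
        for plate, mins in sorted(durations.items())
    ]


durations = minutes_between_sightings(segregate(PLATE_READS))
for line in text_plot(durations):
    print(line)
```

A real BI tool would read from a live camera feed and draw proper charts; the text chart here only stands in for the plotting stage.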
On the other hand, there has been limited use of big data in Pakistan. There may be many reasons, including a lack of research, limited technological advancement and a lack of rules, regulations or laws. Yet one of the primary reasons is the unavailability of data, big or small.
The overall understanding of the industry is limited, though unlimited opportunities await the populace in terms of jobs and work. An online search reveals that thousands of jobs are available in the U.S. and other countries for data analysts and scientists. The market is open and growing fast. However, the limited understanding of the sector in Pakistan might not allow us to realize the full potential that awaits.
As far as Pakistan is concerned, the data is being utilized by different large companies and corporations. “However, the smaller units are unable to fully utilize the industry due to lack of data streams,” claims Dr. Adnan Jabbar, a predictive data analyst and founder of Seek Drug.
The government is trying to introduce the technology, and the Punjab Information Technology Board (PITB) is one such example: 80 to 90 percent of its work is based on big data, says Dr. Adnan.
One of its programs, the Citizen Feedback Monitoring Program (CFMP), is largely based on big data, allowing the government to monitor feedback through large chunks of data. “They have critical feedback which they plot,” says Dr. Adnan.
The government actually manages large amounts of data. This data is already being utilized in Punjab to track its vaccination program on polio, dengue, etc., which also speaks to the potential to utilize it around the country.
In terms of business data, one of the largest sources of big data is the stock exchange, but the data is maintained and accessed by brokers only as it has critical information in it. Overall, the data with the government agencies is a major source. This is the reason that there is almost no use of big data in electioneering in Pakistan.
The big data limitations and electioneering in Pakistan
The political parties in Pakistan have been using social media and other tools of engagement towards voters, yet they have been unable to fully utilize it to their benefit. Similarly, no heed is being paid to the analytics of people’s engagement in the entire process.
There are various limitations with big data in Pakistan. “We lack a lot of essential data about electoral process, whether big or small,” says Hassan Nasir Mirbahar, Team Leader of Democracy Reporting International, a Berlin-based research organization which has been working in Pakistan on democratic reforms since 2010. He finds that data is usually unavailable, and whatever data is available is inconsistent, which makes the electoral process contentious as well. For example, in the 2013 general elections, concerns about the credibility of the elections erupted when problematic results were issued based on conflicting sets of data.
Hassan revealed that the Election Commission of Pakistan (ECP) took the positive step of introducing a Result Management System (RMS) in the 2013 elections. Yet the ECP reported that the system was not properly utilized by the Returning Officers in many constituencies. Not just results, he added, even delimitation maps were not available in Pakistan. By contrast, one can find data on the primaries in the U.S. elections on the New York Times website.
He added that issuing data and information about all electoral processes, such as results, delimitation and voters’ registration, would not only make the electoral process transparent but would also improve confidence in the credibility of the elections.
Hassan, who is an elections and democratic governance expert, finds that there is a lack of understanding vis-à-vis the proper use of technology within elections. For example, there is a general inclination towards introducing electronic voting machines. Global experiences and research show that such technology in elections can actually be counterproductive. He believes systems like RMS should be improved and reused in future elections. That is where technology can play a crucial role, as research and comparative global experiences show. It will be vital that the ECP organizes a series of pilot testing exercises to further develop a robust RMS.
Overall, the political parties are also not engaged in strategic thinking, and thus they miss out on the opportunity to fully understand their target voters, constituencies and their actual problems. Similarly, there are very limited independent opinion polls in the country, he contends.
Hassan does not see extensive utilization of big data in the 2018 general elections. “We will see some usage of big data, yet its full strength would not be utilized in the upcoming elections as there is no such culture in our political parties and other actors,” he explains.
Data can only be utilized if it is released. The problem is that our institutions do not release it. “In the absence of latest census data, data from institutions like NADRA cannot help in electoral matters fully but can also be useful to understand information about a number of demographic aspects such as age groups, education and health, etc.,” Hassan adds.
Similarly, the data with the ECP can be utilized to understand issues of voter turnout and women’s and minority participation, and to target improvements for future elections accordingly. He feels that since the right to information has become a constitutional right in Pakistan, institutions like the ECP and NADRA should be proactively putting essential data and information in the public domain.
On the other hand, Dr. Adnan Jabbar finds that the data is definitely available. “If someone is not giving you data, there are other ways to extract the information through technology,” he says. He acknowledges that it becomes a bigger challenge if the data is not released publicly. He adds, however, that by utilizing OCR technology, all the forms held by electoral staff can be converted into big data and used to understand the entire demography.
Dr. Adnan has previously compiled publicly scattered data for medical information and provided it on a free platform, Seek Drug.
If data is available, we can have predictive analysis in the health sector. However, he believes that the data can be misused. “As soon as human intervention is involved, it becomes vulnerable,” he asserts. He is concerned about the challenges of big data in Pakistan. The latest hacking attempt on a private website, Zameen.com, left a number of users vulnerable, he says, adding that there is an urgent need for an organization and regulations to deal with breaches of privacy.
Former Secretary, Election Commission of Pakistan (ECP) Kanwar Dilshad believes that the big data with ECP and NADRA is not being fully utilized in the public interest. He believes that legislation is necessary to ensure that the data is available to the general public and subsequently informed decisions can be made by the policy makers.
Kanwar also finds that there is a severe gap in the capacity of political parties in the country, and that they have no setup to deal with modern facilities. “The political parties must create proper infrastructure, establish research units and take informed decisions after analyzing big data,” he adds.
However, there is historically a limited use of data in the electoral process. For example, the voter lists are available to political parties and they utilize this data. Moreover, the opinion polls and surveys also help in understanding the choices of general public in the country.
Moreover, many firms use data for their research and studies. “The political parties hire the services of different firms to gauge public sentiment and opinion,” says Faiz Paracha, an independent journalist who has been covering elections in the country. He adds that entities such as PILDAT, SDPI, FAFEN and Alif Ailaan compile different sorts of data and provide analysis on the basis of this data.
Kanwar finds that although there is a lot of room for growth in the industry, the electioneering process does make use of data. “It must be refined, though,” he adds.
The development sector and big data
The development sector also faces a number of challenges in utilizing big data in Pakistan. This further aggravates the development challenge, as practitioners usually have to rely on whatever data is available. “Data acquisition is one of the major challenges. We tried to acquire data from the ‘Benazir Income Support Program’ (BISP), yet the formalities involved were too tough for us,” says Saman Naz, Data and Evidence Manager at Alif Ailaan, a nonprofit organization working in the field of education in Pakistan. She recommends that the government should adopt an open policy.
On the overall situation, Saman believes that data savviness is lacking but is on the rise. She has been engaged in data analysis on education for the past several years.
Fasieh Mehta, former joint director of E-Government Services at PITB and currently a senior consultant for Oxford Policy Management (Pvt.) Limited, believes that there is a lot of capacity for utilizing big data. “Pakistan is one of the few countries in the world which has a computerized system of citizen database with NADRA,” he adds.
He finds that there are a number of challenges, such as data acquisition. According to him, the data maintained is not very reliable as it is not regularly updated. He also finds that the government is not open about sharing the data it has. Furthermore, he believes that one of the major challenges is that the departments within the government do not share data with each other. “The data with livestock or agriculture department could be helpful for many institutions and bodies within the government,” he explains.
Fasieh recommends that the government should focus on establishing a shared common database with data from all departments and organizations. He said the Punjab government has established a monitoring system for many of its departments. He shared that by utilizing big data, the government had managed to review all data about relief activities, flood streams and the utilization of different resources. It is all recorded data which provides specific details on all activities of the government.
Fasieh Mehta finds lack of tech awareness as one of the major challenges for common businesses. “However, there is a lot of scope to utilize big data for small businesses and industry,” he adds.
A prominent data scientist, Rana Usman, who works at Logic Powered, an offshore office of a Canadian firm, believes that there are not more than 20 companies in Pakistan that are focused on data science. He says there are many firms working on business intelligence and analytics; however, they are not focused on data science as a whole.
Rana also finds that there are issues with data streams in Pakistan. He noted that there are around 2,000 variables in the American Community Survey, which is regularly updated through the census, whereas we had our last census in 1998. “The government needs to publicize its data,” recommends Rana.
Big data potential in Pakistan
There is a lot of scope to work on big data in Pakistan. The political parties, parliamentarians and other institutions must be strengthened and their capacity must be enhanced to ensure that they are able to utilize big data ahead of the future elections and carry out their planning accordingly. Similarly, the data available with the telecommunications industry can be utilized for the larger interests of the small industry by sharing specific benefits for the customers. One of the specific interventions can be related to the online industry, which is booming in the country, yet it is unregulated and involves an increased amount of big data.
Big data can also be utilized to deal with crime, policing, taxation, property management and traffic control by introducing BI tools as easy management tools in the administrative structures of the country. This will not only resolve the current challenges but will also provide easier and more reliable solutions to citizens. Similarly, small industry and businesses can be engaged by involving the larger corporations and governmental entities, so that they can utilize large amounts of data to increase their revenues, develop the documented economy and promote entrepreneurship. Thus, big data can help Pakistan advance at a much faster pace than the tax managers can even imagine.
Waqas Banoori is an Islamabad-based journalist. He is a Daniel Pearl-Saleem Shahzad Fellow (2013), under which he worked for five months at the Pittsburgh Post-Gazette in Pittsburgh, Pennsylvania, United States.