Every day, we create 2.5 quintillion bytes of data. They come in a tidal wave of tweets, Facebook postings, Google searches, emails, texts, website "clicks," app downloads, credit card purchases and the photographs and home videos we post on the Internet.
"It continues to grow at an unprecedented rate," said Mike Schroeck, vice president of business analytics for IBM.
"We've generated more information in the last two years than we have through all of history up to that point." Analytics means a retailer can lure new customers or keep old ones. It means a hospital can track a patient's health and know when he or she might need emergency care. And it means a credit card company can track fraudulent purchases as they happen.
Analytics is not new. But how companies such as IBM gather and analyze data is changing as fast as the Internet continues to grow. Columbus is at the forefront of the effort to harness big data. IBM is installing a first-of-its-kind Client Center for Advanced Analytics at its Tuttle Crossing campus that will pair industry experts with computer programmers skilled at scouring the growing ocean of data.
Businesses, government agencies, health-care providers and others will pay for the information.
Internet giants Google and Amazon get much of the credit for the evolution of the concept. Both companies wanted to analyze the data left behind when people search the Internet or buy things online. For example, Google uses a process called "data warehousing" to help improve the accuracy and speed of searches.
It's why visitors see a list of likely search options before they are finished typing a query. And Amazon uses it to track what we buy so the company can pitch other products that we might be interested in. Bought some running shoes? Amazon will pitch running socks, shirts and shorts.
"It's called the 'next best offer,'" said Jim Gallo, national director of business analytics for Information Control Corp., a North Side business that offers analytic-type services.
Supermarkets use the same tactic, he said.
"Based on what you bought, (they send) a coupon for a product you're likely to buy."
But data warehousing is just a start. There is a program called Hadoop that businesses use to tap anything that's produced and collected on the Internet.
It's different from most data analysis software that sorts facts and figures kept in spreadsheets and databases.
That's called "structured data."
Hadoop uses a process called "MapReduce" to link and analyze data from any and all sources.
For example, tweets and texts that express opinions during a presidential debate can be analyzed and sorted by region, helping political parties decide where to send ads and mailings. In another use, crime statistics can be compared against traffic patterns and employment data to help police departments plan where their officers patrol. This type of sophisticated analysis can be a big job. Look at that number at the top of this story.
We produce 2.5 quintillion bytes of data every day. The key to managing this is spreading queries across a network of computers called servers.
Computer programmers tell Hadoop what they want it to find, and the program maps out the job's tasks among a hierarchy of servers. Each server takes smaller and smaller pieces of the work, collecting the data, sending it up the chain and finally reducing it to return the requested results.
One query could occupy thousands of servers. "It's the ability to handle terabytes of data," Gallo said.
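The map and reduce steps described above can be imitated in a few lines of Python. This is only a single-machine sketch of the general idea, not Hadoop's actual interface: the sample messages, the keyword being counted and every function name are made up for illustration. Each chunk of messages stands in for the share of work one server would receive.

```python
from collections import defaultdict

# Hypothetical sample data: (region, message) pairs standing in for tweets.
MESSAGES = [
    ("Ohio", "great answer on jobs"),
    ("Ohio", "weak answer on taxes"),
    ("Texas", "great debate tonight"),
    ("Ohio", "great closing statement"),
]


def map_phase(records):
    """Map step: emit (region, 1) for each message containing the word 'great'."""
    for region, text in records:
        if "great" in text.split():
            yield region, 1


def shuffle(pairs):
    """Group the emitted values by key so each key can be reduced on its own."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(chunks):
    """Run the map step on each chunk (one per imagined server), then reduce."""
    pairs = []
    for chunk in chunks:
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


# Split the data into two chunks, as if two servers shared the work.
chunks = [MESSAGES[:2], MESSAGES[2:]]
result = run_job(chunks)
print(result)
```

In a real cluster the chunks would be processed on separate machines at the same time, and the grouping step would move data across the network. Here everything runs in one process so the flow is easy to follow.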
One Columbus-based company, Datacenter.BZ, offers thousands of servers for clients as well as 86,000 square feet of space for companies such as AT&T and Verizon to house thousands more. "It's a big, established and quickly growing industry," said Mike Scherer, a Datacenter.BZ co-owner. "You need the connectivity and network access."
Schroeck said IBM plans to use its own computing assets for its analytics center. He said IBM has signed with several clients, but he would not say who they are or what the center's first tasks will be.
He did say, however, that there will be plenty of work.
"One of the key challenges organizations are facing right now is a lack of big-data analytical skills," he said. "They need people who understand Hadoop and analytics or sophisticated analytic techniques and also understand how to get that information."
Distributed by MCT Information Services