Canada data firm AIQ may face legal action in UK

The UK’s Information Commissioner is considering legal action against Canadian data firm AggregateIQ (AIQ).

It follows testimony from co-founder Jeff Silvester to Canadian MPs in which he claimed to be co-operating with the watchdog’s inquiry into Cambridge Analytica.

In a statement, commissioner Elizabeth Denham said this was not the case.

AIQ is implicated in a data privacy row involving Facebook and the political consultancy.

Ms Denham said: “The Canadian company Aggregate IQ has so far not answered the substance of our questions in the ICO’s investigation.

“In recent correspondence we were advised that the company would not answer any more questions from my office, stated it was not subject to our jurisdiction, and considered the matter closed. We are considering the legal steps available to obtain the information.”

In response, Mr Silvester said his firm had received two letters from the ICO, one in May 2017 and one in January 2018.

“We responded to both as fully as we were able to and in a prompt manner,” he told the BBC.

“We would prefer that the UK Information Commissioner simply contact us if she has new questions to ask. In fact we are waiting on the commissioner to respond to two letters we sent her office recently.”

Evidence to MPs

AIQ faces a range of other questions.

Ex-Cambridge Analytica employee Christopher Wylie alleges AIQ was given 40% of Vote Leave’s budget to create “digital and social targeting” for the Brexit referendum campaign.

He also alleges the Canadian firm had extremely close ties with Cambridge Analytica.

In a written statement to MPs who are investigating both firms, he claims AIQ was set up solely to build online advertising technologies for Cambridge Analytica and its parent firm SCL.

Prior to this, he claims, AIQ had no clients.

In separate testimony to MPs, Facebook’s chief technology officer Mike Schroepfer said it had also found links between the firms.

Mr Silvester confirmed to the BBC that his firm did undertake work for SCL between 2013 and 2016 but disputed Mr Wylie’s claims that it was set up solely for that purpose.

Cambridge Analytica has denied it had “direct links” to AIQ, saying it was introduced to the firm by Mr Wylie.

“Zack Massingham, our chief executive officer, created AggregateIQ in 2011 for his personal political work. Zack and I then incorporated in 2013 to work together,” Mr Silvester told the BBC.

The firm also faces questions about political advertising it undertook for Vote Leave during the Brexit referendum.

During his testimony, Mr Schroepfer told MPs that AIQ had spent $2m (£1.6m) on Facebook advertising for the campaign.

He said he did not know where AIQ had acquired the data for this advertising but thought the information for its targeted ads came “from email lists” rather than from the app created by Dr Aleksandr Kogan, which is at the heart of the data scandal.

Mr Silvester told the BBC that his firm “did not use any improperly-obtained Facebook data”.

“The only personal information we use in our work is that which is provided to us by our clients for specific purposes. In doing so, we do our very best to comply with all applicable privacy laws in each jurisdiction where we work.”

The firm has, however, been suspended by Facebook.

Tracking down the data firm

The BBC recently visited Victoria, where AIQ is based, to try to find out a little more about the firm and its links to Brexit.

According to Google Maps, its address is listed as 501 Pandora Street, part of a trendy square just off the waterfront in British Columbia’s capital city.

AIQ’s old office is up for lease and, according to the property management company, “two guys had left hurriedly in January”.

Then the BBC received a tip-off that the firm had just recently relocated to an office two blocks away.

The third floor office, also home to a law firm and a software and engineering company, was almost entirely empty.

Only two people were working there; a dark-haired man who appeared to be in his early 30s came to the door but refused to speak to the BBC.

According to Mr Silvester the move in January “was scheduled for some time”.

He added that the firm currently has seven full-time employees, who are all software developers and online advertising specialists.

Reality Check: Was Facebook data’s value ‘literally nothing’?

There is a huge spectrum of opinion on the value of the Facebook data that Cambridge University academic Aleksandr Kogan gave to Cambridge Analytica’s parent company, SCL.

Dr Kogan told a parliamentary committee: “Given what we know now, nothing, literally nothing – the idea that this data is accurate I would say is scientifically ridiculous.”

On the other hand, there have been suggestions this sort of data will allow computers to gain a profound understanding of people and their preferences.

In a news conference on Tuesday, Cambridge Analytica’s spokesman said the company had also found Dr Kogan’s data set to be “virtually useless”.

The orthodox view among data scientists is that the use of social media data to target adverts on Facebook is in its infancy and not yet hugely effective – but Dr Kogan is going further than that, saying that it was completely without value.

Reality Check has seen Dr Kogan’s unpublished research into the value of predicted personalities for micro-targeting. We judge that he is underselling its value although he is correct to say that the data was not accurate.

Personality test

Let’s go back to where the data came from and what it included.

Dr Kogan had a personality testing app on Facebook. Users would answer questions about themselves and be given scores on the Big Five personality traits – openness, conscientiousness, extraversion, agreeableness and neuroticism – which are used by research psychologists and advertisers.

Dr Kogan says about 270,000 users took this test. Taking the test also gave the app data on all the users’ friends, which created a database of 30 million people and their predicted personality scores, according to Dr Kogan. Facebook puts the figure at up to 87 million.

These personality predictions are based on the idea that if, for example, people who liked particular brands of sports cars and nightclubs turned out to be extraverts, then you might predict that other people who liked those things would also be extraverts.
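As a toy illustration of how such like-based prediction can work – the pages, scores and averaging method below are invented for illustration, not Dr Kogan’s actual model – you can score each page by the average trait value of the test-takers who liked it, then score someone who never took the test by averaging over the pages they like:

```python
# Toy sketch of like-based personality prediction (invented data and
# pages; not the actual model). Each page is scored with the mean
# extraversion of the survey-takers who liked it; an unsurveyed user
# is then scored by averaging over the pages they like.
from collections import defaultdict

# Survey respondents: (measured extraversion on a 0-1 scale, pages liked).
respondents = [
    (0.9, {"sports_cars", "nightclubs"}),
    (0.8, {"nightclubs", "cooking"}),
    (0.2, {"chess", "cooking"}),
    (0.1, {"chess"}),
]

# Step 1: average trait score per page.
totals, counts = defaultdict(float), defaultdict(int)
for score, pages in respondents:
    for page in pages:
        totals[page] += score
        counts[page] += 1
page_score = {page: totals[page] / counts[page] for page in totals}

# Step 2: predict an unsurveyed user (e.g. a friend) from their likes.
def predict(likes):
    known = [page_score[p] for p in likes if p in page_score]
    return sum(known) / len(known) if known else None

print(predict({"sports_cars", "nightclubs"}))  # high score: likely extravert
print(predict({"chess", "cooking"}))           # low score: likely introvert
```

Real systems of this kind work on vastly more data and use regression rather than simple averaging, but the principle – correlating likes with measured traits, then extrapolating to people who only supplied likes – is broadly the one sketched here.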

You can see a similar sort of system on the website of the Psychometrics Centre at the University of Cambridge, which attempts to predict your personality test result based on your social media activity.

Inaccurate predictions

Dr Kogan’s research was funded by SCL, the research and communications company that formed Cambridge Analytica. Dr Kogan passed the data, including some of the pages that users had liked, to SCL.

Dr Kogan now says that the data he gave to SCL was useless for targeting adverts on Facebook because individual predictions were too inaccurate.

But some data scientists argue that the overall quality of the personality predictions is not the most important measure.

Part of the point of targeted advertising is to reduce costs by trying to appeal to only a relatively small number of users.

So you might be more interested in people at the extremes of particular personality measures than in those close to average, because they are the ones most likely to exhibit the traits you are targeting.

As such, the overall reliability of the data may be less important than finding groups who may be targeted.

Also, Dr Kogan argues that trying to assess the personality of an individual gives too large a margin of error, so the predictions are reliable only if you’re taking averages across larger groups. But looking at larger groups may be helpful during an election, when you might be trying to decide where to buy advertising on local radio or where to hold an election rally, for example.
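That statistical point can be demonstrated with a small simulation – the trait and noise figures below are invented for illustration: averaging many noisy individual predictions shrinks the error roughly in proportion to the square root of the group size.

```python
# Simulation (with invented numbers) of why individually unreliable
# predictions can still be useful in aggregate: the error of a
# group average shrinks roughly as 1/sqrt(group size).
import random

random.seed(42)
TRUE_SCORE = 0.6  # the group's actual average trait score
NOISE = 0.3       # standard deviation of one person's prediction error

def predict_person():
    """One noisy individual prediction."""
    return TRUE_SCORE + random.gauss(0, NOISE)

def group_error(n, trials=500):
    """Mean absolute error of the average prediction over groups of size n."""
    errors = []
    for _ in range(trials):
        group_mean = sum(predict_person() for _ in range(n)) / n
        errors.append(abs(group_mean - TRUE_SCORE))
    return sum(errors) / trials

# Individual predictions miss badly; averages over large groups do not.
for n in (1, 25, 400):
    print(n, round(group_error(n), 3))
```

Run it and the printed error falls sharply as the group grows – which is why this sort of data could still inform decisions about whole constituencies or radio markets even when any one person’s predicted score is unreliable.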

So Dr Kogan is underselling the value of his dataset. While not all of it would have been useful, parts of it could have been helpful.


Tech Tent: Questions for Zuckerberg and Cambridge

It was a two-day interrogation with dozens of questions – some of them acute, some of them rambling, a few quite bizarre.

On the Tech Tent podcast this week, we zero in on what Mark Zuckerberg failed to answer during his US congressional appearances, about just how much data Facebook collects – and the control users have over it.

We also try to find out whether something bad is going on at the University of Cambridge when it comes to academic use of Facebook data, as Mr Zuckerberg suggested.

  • Stream or download the latest Tech Tent podcast
  • Listen live every Friday at 15.00 GMT on the BBC World Service

The single most uncomfortable moment for Facebook’s founder was probably when Senator Dick Durbin asked him whether he would share with the committee the name of the hotel where he had spent the night in Washington.

After a long pause and an embarrassed grin, he answered “umm…no!”

It made the point, according to Senator Durbin, that he was more cautious about his privacy than the average Facebook user who “checks in” without a thought.

The following day, he was asked by Congressman Ben Lujan about the data collected on people who had never even signed up to Facebook. Again, Mr Zuckerberg appeared uncomfortable. He had never heard of the widely used term “shadow profiles” to describe this kind of data collection.

Then the congressman took us down an Alice in Wonderland-style rabbit hole, where people who do not use Facebook are told to log in to their Facebook accounts to find out what data Facebook holds on them. “We’ve got to fix that,” he said.

Frederike Kaltheuner from Privacy International tells Tech Tent that this kind of data collection, with users unaware of what is happening, is all too common – and Facebook is far from the only culprit.

We also examine the issue raised by Mr Zuckerberg when he was asked whether he planned to sue either Dr Aleksandr Kogan or Cambridge University over the misuse of Facebook data.

‘Stronger action’

He talked of a whole programme at the university, where a number of researchers were building similar apps to the one made by Dr Kogan for Cambridge Analytica.

“We do need to know whether there was something bad going on at Cambridge University overall that will require a stronger action from us,” he said.

The university fired straight back. Mr Zuckerberg should have known that perfectly respectable academic research into social media had been going on, some of it with the involvement of Facebook employees. And as for Dr Kogan, the university had written to Facebook about its allegations against him but had not received a reply.

On Wednesday morning, before Mr Zuckerberg’s remarks, I visited the Cambridge Psychometrics Centre and found some acknowledgement of the harm caused to the university’s reputation.

The Centre, which is located in the Judge Business School, was drawn into the controversy when Facebook banned Cubeyou, another firm that had developed a personality quiz in collaboration with the university’s academics.

Business development director Vesselin Popov insisted it was opt-in only and was in line with Facebook’s policies at the time, so was not at all like the app developed for Cambridge Analytica by Dr Kogan.

He told me that Dr Kogan’s work had raised issues for the university: “Even if an academic does something – quote unquote in their ‘spare time’, with their own company – they still ought to be held to professional standards as a psychologist.”

Dr Kogan and the Cambridge Psychometrics Centre are in dispute over whether a row over his personality app – and the involvement of the centre’s academics – was about ethics or money. I wrote another article about that issue on Friday.

But the two sides agree that Facebook needs to focus on what commercial businesses do with user data, rather than academics.

“It’s very clear that Cambridge Analytica and these kinds of companies are the product of an environment to which Facebook has contributed greatly,” says Mr Popov. “Although they might be making some changes today in response to public and regulatory pressure, this needs to be seen as an outcome of very permissive attitudes towards those companies.”

With an audit of thousands of Facebook apps under way, we may hear more in the coming weeks about just how cavalier some companies have been with our personal data.


Facebook to vet UK political ads for May 2019 local elections

Facebook’s chief technology officer is to promise MPs that the social network will act to make political advertising far more transparent for UK users.

Mike Schroepfer will say that his firm will be ready to authorise ads in time for England and Northern Ireland’s May 2019 local elections.

He will make the pledge while giving evidence to a parliamentary committee.

Facebook had previously committed itself to similar action in the US later this year.

Mr Schroepfer is being questioned as part of the Digital, Culture, Media and Sport Committee’s inquiry into fake news.

But the politicians also want to know more about the leak of Facebook data to the political consultancy Cambridge Analytica.

The committee had wanted to hear from Facebook’s founder and chief executive Mark Zuckerberg.

However, he opted to send other executives to answer questions from politicians outside the US, having given two days of testimony in Washington earlier this month.

Advert archive

In his opening remarks, Mr Schroepfer will tell MPs that he and his boss are deeply sorry about what happened with Cambridge Analytica, which he will describe as a breach of trust.

He will also promise to deploy a new “view ads” button in the UK by June 2018, which will let members see all the adverts any page is showing to users via Facebook, Messenger and Instagram. The company first launched the facility in Canada last October.

In addition, Mr Schroepfer will promise the following will be up and running in time for the 2019 local elections:

  • political ads will only be allowed if they are submitted by authenticated accounts
  • such ads will be labelled as being “political” and it will be made clear who paid for them
  • the adverts will subsequently be placed in a searchable archive for seven years, where information will be provided about how many times they may have been seen and how much money was spent on them

But MPs are likely to have questions about the use of Facebook in past elections, notably the EU referendum, and whether there was any foreign involvement.

They will also want to drill down into the Cambridge Analytica affair and find out whether Facebook has uncovered similar cases during an audit of developer behaviour.


Mark Zuckerberg’s dreaded homework assignments

Over two days, almost 10 hours.

If you watched every moment of Mark Zuckerberg’s testimony in front of Congress this week, you’ll know he rolled out one phrase an awful lot: “I’ll have my team get back to you.”

Now some of these were bits of data Mr Zuckerberg simply didn’t have to hand – such as why a specific advertisement for a political candidate in Michigan didn’t get approved.

Other follow-ups, though, will require some hard graft from his team. What they produce could provide even more negative headlines for the company, as it is forced to divulge more of its inner workings than it has ever felt comfortable with.

Looking through the transcripts, I’ve counted more than 20 instances where Mr Zuckerberg promised to get back to representatives with more information. But these are the assignments I think could cause the company the most headaches – and provide some revealing answers.

1) Data on non-users

Set by: Congressman Ben Lujan (Democrat, New Mexico)

“You’ve said everyone controls their data, but you’re collecting data on people who are not even Facebook users who have never signed a consent, a privacy agreement.”

Dubbed “shadow” profiles, details of exactly what Facebook gathers on people who haven’t even signed up to the service have always been a bit of a mystery.

Even, apparently, to Mr Zuckerberg himself. He testified that he didn’t know the term, but acknowledged the firm did monitor non-users for “security” purposes.

Mr Zuckerberg promised to share more details on what data is gathered on people who don’t sign up for Facebook, as well as a full breakdown of how many data points it has on those who do.

In a related request, Mr Zuckerberg will provide details on how users are tracked (on all their devices) when they are logged out of Facebook.

2) Moving to opt-in, not opt-out

Set by: Congressman Frank Pallone (Democrat, New Jersey)

“I think you should make that commitment.”

Creating new regulation will be an arduous, flawed process. But one thing Facebook could do right now? Move to an opt-in model – one which requires users to actively choose to make something public, rather than leaving public as the default (and most popular) option for posting content, as it is now.

In a similar vein, Mr Zuckerberg was asked to get back to Congressman Frank Pallone on how the company might consider collecting less information on its users.

3) Repercussions for censorship mistakes

Set by: Congressman Steve Scalise (Republican, Louisiana)

“Was there a directive to put a bias in [the algorithms]? And, first, are you aware of this bias that many people have looked at and analysed and seen?”

One surprising admission made by Mr Zuckerberg before these hearings was that despite acknowledging the company made big mistakes, nobody has been fired over the Cambridge Analytica affair.

Representative Steve Scalise wants to take questions on accountability a step further.

In cases where Facebook reverses a decision to remove content – i.e. admitting it over-moderated – what kind of repercussions do those responsible face? If someone created an algorithm that unfairly filtered certain political views, was there any kind of punishment?

4) Specific rules for minors

Set by: Senator Ed Markey (Democrat, Massachusetts)

“We’re leaving these children to the most rapacious commercial predators in the country who will exploit these children unless we absolutely have a law on the books.”

On Facebook the minimum age of users is 13, not counting the company’s Messenger for Kids app (which doesn’t collect the type of data Facebook’s main app does).

But those aged 13 to 18 – or maybe 21 – should be protected by tighter rules in those oh-so-delicate years, Senator Ed Markey suggested.

Mr Zuckerberg said the idea “deserved a lot of discussion”, but maybe not a new law. He promised to get his team to “flesh out the details”.

5) How many ‘like’ and ‘share’ buttons are out there?

Set by: Congresswoman Debbie Dingell (Democrat, Michigan)

“It doesn’t matter whether you have a Facebook account. Through those tools, Facebook is able to collect information from all of us.”

It seems like everywhere you look there is a button prompting you to “like” or share things on Facebook – indeed, there’s one on the page you’re reading right now.

A request to at least estimate how many of Facebook’s buttons are out there might at first seem like an abstract demand – but the response could be quite something.

The “like” buttons enable Facebook to track users on pages that are not part of Facebook itself, providing more data for advertisers.

If it’s even possible to tot up how many buttons are out there on the web, expect a number in the hundreds of millions – that’s hundreds of millions of pages with which Facebook is tracking your activity beyond its own borders.


Cambridge University saw ‘no issue’ with Facebook research

The academic at the centre of Facebook’s data scandal has hit back at Mark Zuckerberg’s suggestion that “something bad” might be going on at Cambridge University.

Dr Aleksandr Kogan, who collected data for Cambridge Analytica, told the BBC that Facebook should be investigating commercial uses of its data, not focusing on academic research.

He also denied that fellow academics had had any “ethical issues” with his work for Cambridge Analytica.

On Wednesday, Mark Zuckerberg said at a congressional hearing that there were a number of Cambridge academics building similar apps to Dr Kogan’s.

He said Facebook needed to know “whether there was something bad going on at Cambridge University”.

Commercial purposes

In an email to the BBC, Dr Kogan said it was true that the Cambridge Psychometrics Centre had developed a personality quiz to collect Facebook data, and that the dataset was shared with academics around the world.

However, he added: “It’s surprising that Facebook would choose to focus its investigation on academics working with other academics. There are tens of thousands of apps [which] had access to the data for commercial purposes.

“I would have thought it makes the most sense to start there.”

On Wednesday, Cambridge University said it was surprised that Mr Zuckerberg had only recently become aware of its research into social media, since it had appeared in peer-reviewed journals.

It said Facebook had not responded to its request for information about the allegations against Dr Kogan.

‘Still representing university’

Dr Kogan also defended himself against criticism by the university’s Psychometrics Centre, which said that even though he had never been connected with it, his commercial activities had reflected on the university as a whole.

Vesselin Popov, the business development director of the Psychometrics Centre, said: “Our opinion is that even if an academic does something ‘in their spare time’ with their own company, they still ought to be held to professional standards as a psychologist because, like it or not, they are still representing that body and the university in doing it.”

Dr Kogan said he was surprised by Mr Popov’s comments as he had discussions with academics at the centre about their participation in the project.

“In truth, the Psychometrics Centre never had an ethical issue with the project, as far as I’m aware. To the contrary, my impression was that they very much wanted to be a part of it,” he told the BBC.

He said the relationship went sour only after a dispute over how much the Psychometrics Centre would be paid for its involvement in the project, not over any ethical concerns.

The Psychometrics Centre, which is based at the university’s Judge Business School, rejected Dr Kogan’s version of events.

It said it had complained to the university authorities about his behaviour towards two of its academic staff, not about the monetary issue.

Cambridge University says it has received reassurances from Dr Kogan about his business interests but is now conducting a wide-ranging review of the case.

Mark Sorryberg 1, Congress 0 – for now

Many of us could probably lay claim to a split personality, but few people are as blatant about it as Mark Zuckerberg.

Facebook doesn’t have one CEO – it has two.

There’s Mark Zuckerberg, the Ultimate Millennial. He wears a T-shirt and jeans, is a Harvard dropout, is happiest in New York and San Francisco, and talks a good game about connecting the world. He’s an engineer and geek who built perhaps the most remarkable network in human history, innovating his way to astronomical wealth. This guy is shy, but has a public persona that accommodates it.

Then there’s a chap I call Mark Sorryberg – the Big Tech Villain. He wears an ill-fitting suit, squirms when in Washington, is blamed for damaging all we hold dear – from rigging elections (“He’s killing democracy”!) to promoting extremism (“He’s unweaving society”!) and not paying enough tax (“He’s screwing the poor”!). This guy is so shy he comes across as awkward and uncomfortable when he should be projecting authority.

As the excellent Zeynep Tufekci wrote in an entertaining blast for Wired, we’ve seen a lot of this second character since the company was founded. In fact, over the past fourteen years, “sorry” seems to have been the easiest word for Facebook’s leader.

In 2006, after the launch of News Feed annoyed users, Sorryberg wrote in a blog: “This was a big mistake on our part, and I’m sorry for it.” In 2007, failures in the Beacon advertising system prompted another grovelling blog: “We simply did a bad job… and I apologise for it.” As Tufekci notes, by 2008, all of his blogs for Facebook were in effect apologies, and we saw several other examples even before he told CNN of the Cambridge Analytica leak: “I’m really sorry this happened”.

So Mark Sorryberg is a familiar figure by now. He was on display in Washington this week, following the biggest crisis in the history of his company. There were several open goals in front of his interrogators, and opportunities to make him squirm and wriggle were not in short supply.

Yet, for the most part, they missed. After nearly 10 hours of grilling, Facebook is – for now – a richer company, Zuckerberg’s authority as CEO is re-asserted, and the potential disaster this week might have been was averted. These are all very short-term interpretations. There could be big trouble ahead. But the lawmakers fluffed it.

Ineffective questioning

The format didn’t help. For non-partisan reasons that are laudable in principle but ludicrous in practice, each lawmaker was given a maximum of five minutes on Tuesday and four minutes on Wednesday. You simply cannot build pressure, interrogate answers, or pursue a line of inquiry in the way necessary over such a short time.

But the representatives didn’t help themselves. In his allotted time, Senator Roy Blunt first told a boring story about his business cards, then gave a shout-out to his 13-year-old son Charlie, who is “dedicated to Instagram… [and] he’d want to be sure that I mentioned that while I was here”, which was sweet.

I’ve transcribed what followed.

Blunt: “Do you collect user data through cross-device tracking?”

Sorryberg: “Er, Senator, I believe we do link people’s accounts between devices in order to make sure that their Facebook and Instagram and their other experiences can be synced between devices.”

Blunt: “And that would also include off-line data? Data that is tracking, that is not necessarily linked to Facebook but linked to one… some device they went through Facebook on?”

Sorryberg: “Senator, I want to make sure we get this right. So I want to have my team follow up with you on that afterwards.”

Blunt: “That doesn’t seem that complicated to me. You understand this better than I do. But maybe you can explain to me why that’s complicated. Do you track devices that an individual who uses Facebook…has… that is connected to the device they use for their Facebook connection but not necessarily connected to Facebook?”

Sorryberg: “I’m not, I’m not sure [of] the answer to that question.”

Blunt: “Really.”

Sorryberg: “Yes”.

A work of literature that penultimate question was not. I don’t understand it, Zuckerberg didn’t understand it – and Blunt definitely didn’t understand it. He seemed poorly briefed, despite the gravity of the occasion.

Unfortunately, it was emblematic of the meandering, ineffective mode that dominated Tuesday. Wednesday’s interrogation was better, but still not as good as it should have been, not least because there were many questions that weren’t asked. The Facebook CEO’s weaknesses weren’t really exploited.

For instance, he should have been pushed harder on how difficult it is to retrieve data that has fallen into the wrong hands. He should have been pushed harder about Facebook’s reaction to news that The Observer newspaper was publishing a story on the subject. His claims that something awry may have been going on at Cambridge University – vigorously denied by the institution – deserved more probing.

And the long history of errors at the company, plus its initial denial that there had been a data “breach” when it came to Cambridge Analytica, were worthy of a mauling that was never heard.

First do no harm

Given the scale of the recent controversy, and the courtroom theatrics of these cross-examinations, there was much to fear for Facebook this week. It was Zuckerberg’s first time getting grilled by the Senate and Congress, and his awkwardness in such public arenas was clear for all to see.

His facial expressions garnered comment on social media – but it was his body language and garb, over which he exercises more control, that struck me. On Tuesday, Zuckerberg’s tie knot was chunky and loose; and his halting responses and nervous smiles didn’t project much authority.

But he maintained his composure and politeness throughout. Investors gave a clear enough verdict: the two days added $26bn, or 6 per cent, to the company’s value.

There is some gridlock in Congress, and America’s politicians have a range of very big problems on their plate. That means that for the time being, the regulatory threat to Facebook – though of course they would say they welcome the chance to work with regulators – comes from Brussels and GDPR, rather than Washington.

In terms of new law or regulation, the question is: what kind? One of the great intellectual challenges in this field is devising regulations that can keep pace with technological innovation: a very hard task. It is wrong to think, for instance, that you can just import the kind of regulation that Ofcom imposes on broadcasters and apply it to video content on social media platforms.

The interrogation to come

While this week has not been the disaster for Facebook that many anticipated, and some wanted, the medium-term threats certainly haven’t gone away. And events of recent months have fundamentally changed the level of scrutiny the company is getting, while making perhaps hundreds of millions of users aware of the trade-off between their free use of Facebook and the digital footprint they leave behind.

As my esteemed colleague Dave Lee has noted, there are plenty of deferred questions that the CEO and his team will need to address. And the demands of British regulators that he gives evidence here, too, won’t quieten any time soon.

In particular, perhaps Congress members who realise this week was a missed opportunity will invite their guest back to clarify several of the points he made. If they are smart, they should see this as the beginning of a process, rather than the end.

But in adopting his apologetic posture with an efficacy his interrogators sadly lacked, Mark Sorryberg got one over on America’s lawmakers when they should have scored an easy win. If he came to Britain, he wouldn’t get such an easy ride – which is the main reason he probably won’t.

‘More than 600 apps had access to my iPhone data’

While Facebook desperately tightens controls over how third parties access its users’ data – trying to mend its damaged reputation – attention is focusing on the wider issue of data harvesting and the threat it poses to our personal privacy.

Data harvesting is a multibillion dollar industry and the sobering truth is that you may never know just how much data companies hold about you, or how to delete it.

That’s the startling conclusion drawn by some privacy campaigners and technology companies.

“Thousands of companies are in the business of harvesting your data and tracking your online behaviour,” says Frederike Kaltheuner, data programme lead for lobby group Privacy International.

“It’s a global business. And not just online, but offline, too, via loyalty cards and wi-fi tracking of your mobile. It’s almost impossible to know what’s happening to your data.”

The really big data brokers – firms such as Acxiom, Experian, Quantium, Corelogic, eBureau and ID Analytics – can hold as many as 3,000 data points on every consumer, says the US Federal Trade Commission.

Ms Kaltheuner says more than 600 apps have had access to her iPhone data over the last six years. So she’s taken on the onerous task of finding out exactly what these apps know about her.

“It could take a year,” she says, because it involves poring over every privacy policy then contacting the app provider to ask them. And not taking “no” for an answer.

Not only is it difficult to know what data is out there, it is also difficult to know how accurate it is.

“They got my income totally wrong, they got my marital status wrong,” says Pamela Dixon, executive director of the World Privacy Forum, another privacy rights lobby group.

She was examining her record with one of the brokers that scoop up and sell data on individuals around the globe.

She found herself listed as a computer enthusiast – “which is a bit annoying, I’m not running around buying computers every day” – and as a runner, though she’s a cyclist.

Susan Bidel, senior analyst at Forrester Research in New York, who covers data brokers, says a common belief in the industry is that only “50% of this data is accurate”.

So why does any of this matter?

Because this “ridiculous marketing data”, as Ms Dixon calls it, is now determining life chances.

Consumer data – our likes, dislikes, buying behaviour, income level, leisure pursuits, personalities and so on – certainly helps brands target their advertising dollars more effectively.

But its main use “is to reduce risk of one kind or another, not to target ads,” believes John Deighton, a professor at Harvard Business School who writes on the industry.

We’re all given credit scores these days.

If the information flatters you, your credit cards and mortgages will be much cheaper, and you will pass employment background checks more easily, says Prof Deighton.

But these scores may not only be inaccurate, they may be discriminatory, hiding information about race, marital status, and religion, says Ms Dixon.

“An individual may never realize that he or she did not receive an interview, job, discount, premium, coupon, or opportunity due to a low score,” the World Privacy Forum concludes in a report.

Collecting consumer data has been going on for as long as companies have been trying to sell us stuff.

As far back as 1841, Dun & Bradstreet collected credit information and gossip on possible credit-seekers. In the 1970s, list brokers offered magnetic tapes containing data on a bewildering array of groups: holders of fishing licences, magazine subscribers, or people likely to inherit wealth.

But nowadays, the sheer scale of online data has swamped the traditional offline census and voter registration data.

Much of this data is aggregated and anonymised, but much of it isn’t. And many of us have little or no idea how much data we’re sharing, often because we agree to online terms and conditions without reading them. Perhaps understandably.

Two researchers at Carnegie Mellon University in the US worked out that if you were to read every privacy policy you came across online, it would take you 76 days, reading eight hours a day.

And anyway, having to do this “shouldn’t be a citizen’s job”, argues Frederike Kaltheuner. “Companies should have to protect our data as a default.”

Rashmi Knowles from security firm RSA points out that it’s not just data harvesters and advertisers who are in the market for our data.


“Often hackers can answer your security questions – things like date of birth, mother’s maiden name, and so on – because you have shared this information in the public domain,” she says.

“You would be amazed how easy it is to piece together a fairly accurate profile from just a few snippets of information, and this information can be used for identity theft.”

So how can we take control of our data?

There are ways we can restrict the amount of data we share with third parties – changing browser settings to block cookies, for example, using ad-blocking software, browsing “incognito” or using virtual private networks.

And search engines like DuckDuckGo limit the amount of information they reveal to online tracking systems.

But StJohn Deakins, founder and chief executive of marketing firm CitizenMe, believes consumers should be given the ability to control and monetise their data.

On his app, consumers take personality tests and quizzes voluntarily, then share that data anonymously with brands looking to buy more accurate marketing data to inform their advertising campaigns.

“Your data is much more compelling and valuable if it comes from you willingly in real time. You can outcompete the data brokers,” he says.

“Some of our 80,000 users around the world are making £8 a month or donating any money earned to charities,” says Mr Deakins.

Brands – from German car makers to big retailers – are looking to source data “in an ethical way”, he says.

“We need to make the marketplace for data much more transparent.”
