We are focused primarily on producing hyperlocalized real time air pollution data - we’re fundamentally a platform company, a data analysis company. In most cases though, we have to create the infrastructure as well in order to get a lot of the data. AirSensa started out of my background in the IT industry as a trend watcher and a strategy consultant.
Indeed. My very first year at university in 1981 in London was designated in the UK by the technology papers as the Year of Artificial Intelligence. Not so much as it turned out, it’s a little bit more effective now! It’s been a fascinating technology through my whole career. As part of my consulting company, I set up a Future Cities & Sustainability division about 13 years ago.
We helped IBM put on a big conference in London in 2010. There are lots of conferences, and you very rarely hear things that surprise you, but I did then - it was when I learned about the impact of air pollution; that was September 2010.
People had just about discovered it - a little bit. We looked at the issue, myself and my CTO at the time, and it became very apparent that actually nobody really measures air pollution usefully anywhere in the world. It’s measured for reporting purposes, modelled ambient averages, reported in retrospect, but nothing useful and actionable.
Yes. We thought, there’s an expression in English, I’m sure you’ve heard it, just "how hard can it be?" [laughs] It’s very hard, as it turns out, starting to do this. But we set about doing it, and we realized it was going to be largely a software problem. Very large-scale IoT, supporting complex sensors, and so on.
We knew how to do software. We’d done that quite a bit before, so we set out to do this. I set up a nonprofit to do it, because there wasn’t really any commercial model that made any sense around it. It’s a little bit worrying, because nonprofits tend to rely on donations, public money, grants, and so on. Those run out.
If you want to create something which exists for a long time, it’s much better to be economically viable, but there wasn’t really an identified economic model around that at the time. Not enough people were talking about pollution.
Exactly. We started to think about developing it. We started piloting in late 2015. Around about 18 months, we had 130 units out mostly in London, various other places. About halfway through that pilot, the VW scandal broke. Now, everybody’s talking about pollution, and they’ve continued talking about it, which is a good outcome.
More and more people around the world are aware of the large-scale health impact that it has. As a result of that, about 15 months ago, we flipped into a commercial organization as we identified a viable commercial model around it, which then meant we could actually start rolling out infrastructure, keeping it viable for the long term.
That’s where we are. We’re the world’s oldest startup, [laughs] in that sense. We’re now in scale-up, and are building meshes in the UK, in the Channel Islands, the Island of Jersey. I don’t know if you’ve ever been there, but it’s an island between the UK and France. They have a very interesting approach there. It’s a digital sandbox. They have a piece of government called Digital Jersey, which encourages people to come in and trial things. It’d be well worth having a look at that, and the approach they take.
I’m just really, really interested. I know how hard it is to do these things, because we’ve been doing it, so I was interested in that. I’m also interested in the wider sense of what you’re actually doing around radical transparency and g0v.
I’m really interested in this approach of open government. I know you’ve only been doing this for three years as part of the government. How far do you think it can go? The openness is very invigorating. It’s very attractive.
We’re pushing the boundaries to find out exactly how far it can go for each and every data set. It varies. For procurement data, the main worry of the procurement agency is that it will be used out of context as a kind of trade negotiation weapon, which creates an unfair trading disadvantage as long as our trade partners do not open their procurement data, but we do in a real-time fashion. That creates a disadvantage at trade negotiations, obviously. That is really the only argument against totally opening up procurement data for any purpose, which is why we elected a model of basically saying the data is public, but not under an open license.
People can ask for a copy of the data if they are either a Taiwanese national promising not to use it for foreign trade negotiation purposes or have a legitimate reason. For example, if you want statistics in the aggregate, you can, of course, give an algorithm to a person or entity trusted with this data, who runs your code and publishes the results back to you.
In that case, the results would be open data. This method is called open algorithm. So it is public data, but only for nationals, on the one hand, and open algorithm for everybody on the other. That is currently the balance we’re striking with the government procurement data.
We’re also actively campaigning for all the governments to adopt the open procurement model. Once they do, there will be no excuse for us not to open it completely, but it does take coordinated action. We’re seeing very much the same for beneficial ownership data of the registries. After the Panama Papers, people really want to know [laughs] where the money is flowing between the jurisdictions.
Again, the jurisdictions that are transparent are pressuring other jurisdictions that are not transparent with data. Then you can do a naming and faming [laughs] to praise the government that is already doing so in the hope that other governments would follow.
All this is part of this larger partnership with the Open Government Partnership, or the OGP, which is like-minded jurisdictions, civil society, people in the social sector, and also private sector contributing to the data.
Key technologies like the mesh network you mentioned, distributed ledgers, and so on, make sure that people can trust each other on their data without the capability of any party to overwrite, or to exploit, to tamper with the data for private purposes. I think a data ecosystem is very important. Once we have this ecosystem, we can prove to more people that it actually makes sense to do so.
We do it through things like the Presidential Hackathon, which you have heard about, with the president giving out trophies, the promise that a three-month prototype becomes public policy as the award, and things like that. We’re integrating this kind of open government not just as a way for the government to publish data, but also as a way for citizen data to influence the government. It’s a bi-directional thing.
A lot of that is resonant. How much do you think that Taiwan’s history and political environment, with China and so on, how much do you think that has amplified what you’re doing? Therefore, how transportable do you think that is to other countries and jurisdictions?
This is a great question. What we have found is that the fruits of our invention basically made it less risky for the public service to engage with the public. Previously, you had to pick up 5,000 phones and explain the same thing over again. Now, by scalable listening, you can just publish one response once and have people cross-moderating each other, up-voting the most relevant questions.
You can laser-focus your response to either emerging pressing questions or even emerging disinformation campaigns. It really saves public servants in the government time. Meanwhile the pressure, as you said, to innovate is very high, because we’re caught between a very authoritarian party that really tries to influence the common society so it becomes authoritarian, whose job is just to export authoritarianism here, and our internal, domestic will to remain absolutely free and open even in the light of such authoritarian forces.
We really pride ourselves and define ourselves in terms of being a free and open society.
That prompts those social innovations that can address those pressing issues without encroaching on either the time of public servants or the freedom of the society.
Once we deliver such solutions, such as social-sector, cross-checked fact-checking bots that are not controlled by the government (the government is just giving rapid responses, sometimes funny ones, 60 minutes after each disinformation campaign), they can then be exported at relatively low cost to even semi-democratic countries.
Basically, we absorb the risk and the initial cost of research because of the social pressure that presses us to do so. Once it stabilizes on something that people can all live with, that tends to be very exportable.
...like that. How does that mesh with what’s going on in that sphere in terms of personal data and making enormous businesses out of mining people’s personal data? Is there a crossover there from your perspective?
Yeah, very much so. There are a few things in Taiwan that are worth highlighting. First, our largest public forum, like Reddit, is actually open source and maintained by a team of students at National Taiwan University. That has a markedly different dynamic when it comes to being agile, being adaptive and respectful of personal data.
Being in an academic institution with the system operators rotating on a four-year or a six-year basis, they really have no incentive to hoard personal data from the users. That is to say, it is basically a social production. It’s open source, anyway.
It’s a social production system that has all the hallmarks of a large site with its own moderation, its own community rules, and things like that, but it’s not being hijacked either to sell to advertisers -- they don’t need advertisement money -- or to sell to the state. They’re independent, because of their academic status, from the state.
They are a hallmark of what is called, in the UK, the voluntary sector and what, in Taiwan, we call the social sector. That is separate from the private or the public sectors’ aims. That provides a safe space for new collusions to happen in that space. The second thing is that Taiwan has used an EU-style privacy law from the very beginning. Ever since we have had a personal data protection law, we have been using the EU style.
People around our region are slowly converging, mostly because of GDPR, to that point of view, but it’s not ingrained in the culture yet. Here in Thailand, I think they’re taking a year just getting people’s heads around data not as an asset but as the beginning of a relationship.
It’s not something to be sold, but rather a fiduciary duty. The data operator needs to prove and earn the trust by acting in the best interests of the data domains, the data commerce, and things like that. That is where we have a head start compared to nearby jurisdictions.
The long-term movement from a goods-related economy to a service economy, and finally, if you like, almost a data-based economy, that sort of approach...I’ll try to think of a non-combative way of putting it. It’s quite destructive to value, and why I’m not a big supporter of Facebook as well.
When it becomes impossible not to. Where you’re seeing large amounts of economies start to be based on the value of data, how far can that openness go? Value has to be created somewhere. How does that reconcile?
As a long-time worker in the free software movement, we had a lot of prior art, basically saying the infrastructure that everybody uses, that is the commons that saves everybody’s costs. That should be common ownership. New innovations where the individual bears risk delivering that, that could be, for a time, private.
That’s the whole idea of copyright systems or patent systems in the first place. Even the copyright holders, who theoretically enjoy practically indefinite ownership, sometimes find it’s nice to basically release their works into the public domain or under certain Creative Commons arrangements. They find that it’s in their longer-term interests to do so.
Maybe it’s more discoverable. Maybe it makes partnership more possible. Maybe they just feel that it serves as a free advertisement, just as the game software developer id Software releases every previous generation of their gaming engines as open source to advance the state of the art, and also to make, I’m sure, hiring easier [laughs] and things like that.
They can act out of self-interest to decide to be part of the commons out of not altruism, but rather, a calculated long-term thinking. It doesn’t work if everybody just looks at the quarterly reports.
If you only look at your next quarter -- of course, everybody’s encouraged to be "patentuers" -- it needs the society to coach the company to take a more long-term view and the public sector to encourage more long-term thinking in terms of our investment policies, social financing policies, and things like that.
We encourage the company to become like B Corps, to declare their explicit purposes in addition to profit, and give them recognition and rewards for doing so. All this is just striking a balance. I’m also not arguing communism. [laughs] I’m saying that...
It is a relief. I’m not arguing communism, but I think this commons distance between the entirely private and entirely communal has a lot of room to grow, even in the face of Facebook and so on. GDPR, with its right to portability, may be the first step. The next step could be, for example, defining an ongoing data relationship as something that could be compensated.
Previously, it was hard because the transaction cost was just too high. Now, it’s actually becoming possible. We’re seeing more federated social networks anyway. This is just some of the possibilities to strike a balance between the two polar opposites.
Obviously, one of the core underlying issues that we have to deal with in what we do is the responsibility, if you like, of data validity. We clearly have a profit motive now because we’re a commercial organization.
Not only to be a good steward, but at the core of this, and this all started off several years ago, has been just about trying to help people avoid pollution and, therefore, improve their health. It’s about healthier society for the individual. That’s very, very new.
One of the things that we’re constantly combating in the discussions we have with governments and others is around data validity. It goes from the simple, which is if what you’re doing is giving people free data, which is essentially part of our model, which is free apps for everybody, citizens aware of where we’re going...
If you’re giving people that data and you’re specifically doing that because you want to not only help generally improve health but certainly very vulnerable groups like asthmatics, cardiovascular patients, then it behooves you to ensure that you are not scaremongering. You’re not giving them the incorrect data where you give them high alerts, and you have to take some actions to avoid...
Sorry to interrupt. In the g0v website Airmap, there is a prominent notice, even before clicking to the visualization, that says, "Please understand that this is crowdsourced data that may or may not be entirely accurate," and, "Please do not panic when you see a spike." Then you have to click, "I’m not panicking" to enter the website.
There’s two key questions I have, really. One was around that. The way we combat that, essentially, and we know very well the foibles and the interest of using low-cost sensors - we fixed, we think, those issues, largely in software, with a four-layer calibration model...
...yes there’s some basic stuff like temperature and humidity effects. But there’s also a number of other things around drift and bias. There’s all of those things you have to combat, which we’ve pretty much fixed. We also work, wherever we go, with universities and research institutes to ensure that we’re contextualizing that. We have an approach to doing that, but I didn’t know how that worked in your context.
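To make the idea of correcting bias against a trusted reference concrete, here is a minimal Python sketch of one calibration step. It is an editorial illustration only and is not AirSensa's four-layer model, whose details are not given in this conversation. The function names and the synthetic readings are assumptions. The sketch fits a gain and an offset by least squares so that a low-cost sensor's readings line up with a co-located reference instrument.

```python
from statistics import mean


def fit_gain_offset(raw, reference):
    """Least-squares fit of: reference ~ gain * raw + offset."""
    raw_mean, ref_mean = mean(raw), mean(reference)
    var = sum((x - raw_mean) ** 2 for x in raw)
    if var == 0:
        raise ValueError("raw readings must vary to fit a calibration")
    cov = sum((x - raw_mean) * (y - ref_mean) for x, y in zip(raw, reference))
    gain = cov / var
    return gain, ref_mean - gain * raw_mean


def calibrate(raw, gain, offset):
    """Apply a fitted gain and offset to raw sensor readings."""
    return [gain * x + offset for x in raw]


# Synthetic example: the low-cost sensor reads double the true value plus 5.
reference = [10.0, 20.0, 30.0, 40.0]
raw = [2 * r + 5 for r in reference]

gain, offset = fit_gain_offset(raw, reference)
corrected = calibrate(raw, gain, offset)
print(gain, offset, corrected)
```

Temperature and humidity effects would need extra terms, and drift could be handled by refitting over a recent window of co-located readings. Both are left out here to keep the sketch short.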
I don’t mean this in any critical way at all, but I’m very sensitive to this [laughs] because I’ve been living it for so long. I’d love to know if there’s a way of really doing that. Obviously, from our perspective, a big cost of this is infrastructure costs. If there’s any way in the future that that could be defrayed or socialized, if you like. I shouldn’t use that word.
It would speed up the whole process. I’m just not sure how we get there. You’ve answered the question already by saying, "Look, this is crowdsourced data." I find it very scary, this crowdsourced data thing, in this space.
I know. There’s two things that we’re actively doing. The first thing is that there is a copyright snapshotted through distributed ledgers in the high-speed computing center. We have one of the top 20 supercomputers in the world, Taiwania 2, in charge of running whatever algorithms people have thought of, operating on the data in place. They don’t have to download anything.
They just upload their model in a Docker container or something similar, and then it runs through the data. That basically makes it scientific. Previously, different professors each had their own predictive and error models. We didn’t know whether they were better because of a more accurate model or because they used more accurate data, because they didn’t share entirely the same data.
Now, because everything is aggregated in the so-called Civil IoT computation platform and standardized using international standards like SensorThings and so on, people can all very easily see that it’s the models that are competing with each other, not the data sources. That’s the first thing that we do, basically establishing a data commons.
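The point about models competing on identical data can be sketched in a few lines of Python. This is an editorial illustration under stated assumptions: the shared dataset, the two candidate models, and the choice of mean absolute error as the metric are all hypothetical and are not taken from the platform described here. When every model is scored on the same data with the same metric, any difference in score comes from the model.

```python
from statistics import mean

# Hypothetical shared dataset: (sensor reading, reference value) pairs.
SHARED_DATA = [(12.0, 10.0), (22.0, 20.0), (33.0, 30.0), (41.0, 40.0)]


def mean_absolute_error(model, data):
    """Score a model on the shared data; lower is better."""
    return mean(abs(model(x) - y) for x, y in data)


def identity_model(x):
    """Baseline submission: trust the raw reading."""
    return x


def offset_model(x):
    """Second submission: subtract an assumed constant bias."""
    return x - 2.0


submissions = {"identity_model": identity_model, "offset_model": offset_model}

# Every submission sees exactly the same data, so scores are comparable.
scores = {name: mean_absolute_error(m, SHARED_DATA) for name, m in submissions.items()}
leaderboard = sorted(scores.items(), key=lambda item: item[1])
print(leaderboard)
```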
Taiwan is an excellent practice in fusion of constitutions. In any case, the Academia Sinica has a dedicated team that calibrates the crowdsourced data. All the research they do is exactly the calibration that you just mentioned. They basically work with the public sector. We also set up Airbox-compatible sensors throughout Taiwan to serve as a government contribution to their community effort.
Using those different correlated data, the people in Academia Sinica are working on algorithms to make it more reliable and exactly solving the drifts and biases. They do publish on that. Because they’re above the administration, people trust that the Academia Sinica is not working for any party’s team. They’re superior, beyond the normal universities in that case.
It’s good to have an arbiter when it comes to this kind of thing. Of course, there are occasionally strong questions from the MPs about whether they cause public panic. By and large, the academic communities, including researchers of Academia Sinica, basically work with processed data.
It is a thorny political problem, but we got around it by saying the president and the legislature have approved a special budget just for this and just for getting the data in the commons and calibrating the data. It’s our national direction. We cannot beat the citizen scientists. We must join them. That was the conversation we had over the past couple of years. It’s pretty stable now.
That would be great. At what stage do you think, or maybe you already are at that stage, can this sort of thing specifically drive policy? At what stage do you say, "Yeah, this is good enough now to actually push that through."?
From our perspective, coming from the background we do and coming from a very data-based approach, rather than an infrastructure approach, we’ve been talking about big data for years and years, and it’s great. We collect more and more data at the moment, but 99 percent of it still isn’t used for anything at all.
We’re very clear about how the value of this rolls through into not only directly helping people, the citizens, through the apps, but it also helps government policy. You have to get to a stage where you can stand there and say, "This is good enough to now base policy on." Are you at that stage yet?
There’s the municipal level and even the township level of public administration. Of course, there’s the national level. At the township level, the value is apparent. If somebody burns rice straw, for example, it takes a day for the satellite image to conclusively prove it, by which time the rice-straw burning is done. Using crowdsourced air sensors, rice-straw burning can be detected in a matter of minutes.
Then the people who audit them can fly a drone over or something. That is immediate value. It’s not really changing any policy. It’s basically saying, "When we say don’t burn rice straws, we actually have a way to..." It’s like community, best practice standards around safety. Those sensors will automatically record. Of course, we can’t fine people just by those sensors alone, otherwise, tampering will be very lucrative.
It’s just an early sign to send people over, to reduce the time it takes to detect and also for the auditors to geolocate the precise position of rice-straw burning or other actions that harm the general health. This is already ongoing for a year or so now. It’s a no-brainer.
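As an editorial illustration of how a sensor stream can give an early sign within minutes, here is a minimal Python sketch of spike detection against a rolling baseline. The function name, the window size, the threshold ratio, and the readings are all assumptions for the example and do not describe the method actually used with Airbox data. A reading is flagged when it far exceeds the median of recent normal readings, and flagged readings are kept out of the baseline so that one spike does not mask the next.

```python
from collections import deque
from statistics import median


def detect_spikes(readings, window=5, ratio=3.0):
    """Return indices of readings above `ratio` times the rolling median.

    The baseline is the median of the last `window` unflagged readings.
    """
    recent = deque(maxlen=window)
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) == window and value > ratio * median(recent):
            flagged.append(i)  # early sign: worth sending someone to check
        else:
            recent.append(value)  # only normal readings update the baseline
    return flagged


# Hypothetical per-minute particulate readings with a short burning event.
readings = [10, 11, 9, 10, 12, 11, 80, 95, 12, 10]
spikes = detect_spikes(readings)
print(spikes)
```

As the speaker notes, a flag like this is only a prompt for an inspection, not evidence on its own, since a single sensor can be tampered with.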
On the municipal level, I think it is also helping to clarify. For example, people will often suspect an industrial park or so on of causing the air pollution, and the park’s operators cannot talk themselves out of it without showing the numbers.
If the private sector provides the numbers itself without any corroboration by other sectors, then people will say, "Maybe you fake your records. Maybe you don’t fake your records, but they are just conveniently intervaled," [laughs] or things like that. There really is no way for the private sector to prove it on its own.
Now, they can say, "Oh, look, there is public sector line posts, Airboxes corroborating, and we don’t control the sensor or the Airbox." They are social sectors. When all the numbers match, they can be more relieved and ameliorate against the PR disasters around the accusations that they somehow polluted the traffic of air, while it’s actually the motorcycles or whatever. It’s more evidence-based.
That’s the municipal level. At the central government level, what we’re ultimately trying to do is basically to export this model to cover not just air but also water, so that we can plan with more confidence the land use for light industry, for heavy industry, for reforming agriculture, and things like that.
Using the common evidence, people can be more accepting about the effects of, for example, a simple second-level production plant on the local plants, which reduces the carbon footprint but which people may worry will pollute the water and things like that.
When it’s participatory, when people can collectively monitor it, then people who want to upgrade their agricultural businesses have more leverage in proving that they can still run an organic farm nearby, or things like that. Basically, it’s a way for citizen deliberation for land use planning.
It really is interesting the way the approach can take off. You’re saying all the same things that I say to people, except we don’t have a structured agreement the way you do. Maybe you haven’t seen this at the level that you are, but have you seen any pushback from the vested interests that generally control this sort of area of work?
There is, and not necessarily for the Airbox project. For other projects, for example, when we released the likelihood of land being affected by sliding rocks, or the instability of the mountains nearby, that did cause a lot of pushback. People worried that the price of the land, of the buildings, would decrease.
Because the government released it at a resolution of maybe a block, everybody in that block gets affected, whereas only the far east side of that block is actually affected. There’s actually a backlash against releasing that data. It’s just government data. It’s not even citizen data. A solution to that, of course, is for the municipal level to release it at a very fine grain.
They have to spend money for it, but they did release it in a higher resolution, so the two buildings that are affected can know for sure they really are affected, but it will not impact the land price of everybody else. We do see pushbacks, but it’s usually solved with better, higher-resolution data, not by withdrawing from releasing data.
If your algorithms can somehow have a demo running in our super-computing cluster to show that for some cases it’s actually better than other prediction models and things like that, I think that will win you goodwill from the entire community across sectors. We really...
Then you don’t have to do any mesh building for that kind of PR. [laughs] That will pave the way to more in-depth partnership, where it could be the supply chain side. It could also be a deployment side, and we’re all very happy to help.
I haven’t gotten that far in thinking about that. I was really fascinated by what you were doing here. We’re having to look at mesh building, largely as a kind of structural debt finance thing, but it’s difficult. I can’t say it in more detail, but we have to do this work because the whole citizen science approach has had a long history, particularly in air quality, of producing horrific results.
I can see why the approach of having the academic backup is important. I can see how that’s essentially part of what we’re bringing to it as a company. Even so, the problems like you don’t know exactly where the sensors are mounted, or how high they are, or etc. etc. all those things.
I know. I think the first municipality that introduced Airbox in Taiwan using public budget is the Taipei Project Management Office of Smart City. They did it in a very smart way. They’re actually here in TechSauce too.
They do this by introducing it as an education tool. They are not talking to economic development or to land use planning because they know that there will be pushbacks. The primary school teachers and junior high school teachers, they love this kind of tech kit that teaches data stewardship.
Unless you really own something that constantly produces data, this whole GDPR thing makes no sense to students. Once you’re a data operator, all of this would start to click, like the studies of the data steward becomes part of the education. The education sector loves it.
They are government sponsored anyway, and so basically just roll it out. It’s really popular with the families, and so on. Many of the families with them buy some more units from Edimax, which thankfully is a business model. That has been replicated in other municipalities. Maybe the education sector is also one.
I’m looking to expand that as fast as possible. It achieves those objectives we originally had. It is a fantastically important thing to do. We do find that where teachers are interested in engaging in it, they’re very passionate about it, and the students tend to be really interested across the board.