Sunbird AI sets out to help curb noise pollution in Kampala

Ugandans' widespread exposure to noise pollution continues unabated because of failures in the country's monitoring and control framework.

Yet there is evidence linking noise pollution to harmful effects on human health and the general environment. While KCCA and NEMA have made efforts to control noise pollution within the capital city, and courts have weighed in on the issue, the problem needs to be approached in more innovative and effective ways. These should empower citizens to be at the forefront of noise monitoring and control using simple technology tools.

Further, it is Sunbird AI’s belief that agencies mandated to control and monitor noise pollution should be empowered with technological tools that enable response and enforcement in more efficient ways.

Against this background, Sunbird AI, an artificial intelligence company, has developed tools that use artificial intelligence to support the monitoring and enforcement of noise pollution controls in Uganda. Sunbird AI envisages that these tools will empower the public to be vigilant actors in detecting and reporting noise pollution.

Ten noise collection agents have been dispatched to 66 parishes within the 5 divisions of Kampala city. An additional 100 will be dispatched at the end of May 2021. These agents will enable assessments of general levels of exposure and provide the data that regulatory agencies need for decision-making and for developing best practices.

Ernest Mwebaze, Sunbird AI’s Director, shared that, ‘identifying areas of high noise pressure is a key element for an effective environmental management and for mitigating impacts, identifying noise hotspots and areas of potential conflicts helps gather baseline knowledge on noise-producing human activities and mapping these areas.’

‘Currently, our focus is limited to more urban and industrialized towns because they have more population and so more human activities going on that are highly considered to be responsible for the high noise production. For example, in Kampala and Wakiso, owing to the level of industrialization, the population, and traffic dynamics (road, air, and railway), the noise pollution continues to increase, unabated,’ Ernest added.

Sunbird AI’s Lydia Sanyu training the noise collection agents on how to use the artificial intelligence to capture noise levels

Equipping citizens to detect, report, and control noise pollution would go a long way in empowering Ugandans to take part in decision-making on a critical issue that affects their lives and health. For the general public to be involved in the regulation of noise pollution, they need technology that helps them sense and measure their personal exposure to noise in their everyday environments. With Sunbird AI's artificial intelligence technology, this is now possible.

Although KCCA and NEMA monitor and control noise pollution, there is a paucity of documented data and trends, and it is reported that there are no established systems to manage and track noise pollution data. This poses challenges and risks for monitoring noise pollution and designing mitigation mechanisms. Sunbird AI is keen on supporting KCCA and NEMA with artificial intelligence to manage and track noise pollution data.

Sunbird AI 2020 Annual Report

Good news: the 2020 Sunbird AI annual report is out.

Our annual report contains the work we did as an organization last year. Considering that 2020 was dominated by a global pandemic with lockdowns, curfews and stay-at-home orders among other things, we started out as a completely remote team. In the absence of the ability to move about physically to coordinate our initially planned projects, we turned to the work we could do at that moment: aiding in the analysis of COVID-related data, on social media and on radio.

We also began research and implementation of AI language technology for five Ugandan languages, starting with a dataset of translations from English to these languages.

You can read more about these projects and about our organisation by downloading the report here: Sunbird 2020 Annual Report.

Happy reading!

COVID-19 Analysis (March 2021)

March 2021 has been a special month: it marks a year since the world began to see the effects of a rapidly spreading global pandemic. From freezing air travel to closing schools to lockdowns to curfews, the pandemic began to change the way we lived our lives. Worse still were the hospitalizations and deaths of so many people, the losses among our families and friends.

One year later, we are still living through these issues in some form. The rollout of vaccinations offers some hope, but the effects of the pandemic are far from over.

For most of the second half of 2020, we worked with the Ministry of Health (Uganda) on social media analysis of public discussions about COVID-19. At this one-year mark, the Ministry of Health requested a follow-up analysis to find out what the Ugandan public generally thinks of the current state of events, including the recent drop in recorded COVID-19 cases in Uganda, the rollout of vaccinations, the continuing curfew, and any other COVID-related issues.

We did the analysis on social media data (Twitter and Facebook), over a period of about 12 months, starting from March 2020 to the end of February 2021.

To carry out the analysis, we developed a two-part system:

  1. A pipeline to fetch tweets from the Twitter API and posts from Facebook (through the CrowdTangle API) and store them for analysis.
  2. A machine learning model (a BERT-based classifier we named SunBERT) that we trained using these tweets/posts to predict whether a tweet is COVID-related or not.

Using Twitter’s new Academic Research API, we collected over 1.9 million Ugandan tweets in the period between March 2020 and February 2021. Using the SunBERT classification model we developed, we found that approximately 50,000 of the 1.9 million tweets were related to COVID-19, and most of those were posted during March and April 2020. Below is the monthly distribution of COVID-related tweets from our analysis:

Let’s look at an analysis of COVID-related tweets in Uganda over this time period:

It is evident that the discussion of COVID-related issues on Twitter was very high when the pandemic had just begun in March 2020 and has been steadily falling as the Ugandan public has become less and less interested in the pandemic discussion.
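The monthly counts behind a trend like this could be computed along the following lines. This is a minimal sketch, not the actual pipeline: the record layout (a `created_at` ISO timestamp and an `is_covid` flag produced by the classifier) and the function name `monthly_covid_counts` are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime


def monthly_covid_counts(tweets):
    """Count COVID-related tweets per calendar month.

    `tweets` is an iterable of dicts with a hypothetical layout:
    {"created_at": "2020-03-21T10:15:00", "is_covid": True}
    """
    counts = Counter()
    for tweet in tweets:
        if tweet["is_covid"]:
            created = datetime.fromisoformat(tweet["created_at"])
            counts[created.strftime("%Y-%m")] += 1
    # Sort by month so the result reads chronologically.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up records purely to show the call; not real data.
    sample = [
        {"created_at": "2020-03-21T10:15:00", "is_covid": True},
        {"created_at": "2020-03-25T08:00:00", "is_covid": True},
        {"created_at": "2020-04-02T19:30:00", "is_covid": False},
        {"created_at": "2021-02-11T12:00:00", "is_covid": True},
    ]
    print(monthly_covid_counts(sample))
```

Dividing each monthly count by the total number of tweets collected in that month would give the share of discussion that was COVID-related, rather than the raw volume.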

Comparing this trend alongside the number of new COVID cases in Uganda (according to Worldometers) reveals a surprising lack of correlation between the two. For example, there was a spike of cases in November but people had got tired of discussing COVID by then, as shown in the graph below:

What has been discussed recently?

Despite the relatively few tweets about COVID-19 in Uganda, there were still quite a number of interesting ones that revealed some underlying sentiments that Ugandans had about the pandemic. Let’s explore this in the following case study:

A case study of February 2021

In February 2021, only around 0.6% of tweets in Uganda were related to COVID-19.

Of these, some messages suggested that the pandemic in Uganda was over, or was not significant, as the examples below show:

“Uganda Covid free 💪🏽”

“Now we register only 12 new cases of Covid 19? Small small 12?”

“What was the fear for. Covid was really hyped”

There were also some other themes of discussion that came up repeatedly. Below are a few examples:

Questioning the need for continued restrictions

“Why is there still a curfew in Uganda?”

“When will curfew be lifted? Asking on behalf of everyone.”

“Sincerely tweeting, why do we still have curfew in Uganda????”

Vaccine hesitancy

“Are we sure vaccines are safe anyway?”

“Do we really need #COVID19 vaccine as UGANDA?”

Testing – the associated expense and need to test

“Testing for covid19 in uganda… it’s like a privilege!”

“I’ve tested for covid 8 times since covid came. never tested positive. tonight i sit here to think about my ka money 😫”

Presence of Covid – Is it there or not? / We should live with it

“Do our leaders know that covid19 isn’t just  a “period” …. this thing is gonna hit us for a long time. We are gonna have to eventually learn how to live with it ….just like we did with HIV n a bunch of other diseases n conditions”

“Why would we import covid vaccines when covid does not exist in the country 🤔”

Let’s also take a look at the popular tweets within this time period. These seemed to mostly discuss the above topics, some by being very grave about it, and others by trying to present them in a comedic way. A look into a few of these tweets shows this:

Popular tweets from the last 4 months (Feb 2021)

“One day I will tell you guys how my mom nearly married me off during the lockdown and how I had to sit her down and give her the “I am not like that kind of girl” speech 😹”

“I have seen great businesses and enterprises close during this Covid19. I have seen renown rich people struggle to provide for their families because their income has been frustrated. If you still have a meal everyday and a roof over your head, count yourself blessed.”

“If there is just one thing I pray for every day is that covid ends and SOPs for taxis are removed. The fact that the common man has to pay twice as much as they used to in order to go to work is heart breaking. I really pray that taxi fares go back to normal soon 🥺.”

“Aaaahhh… but African governments bought tents and cars when rich countries were investing in vaccine research.”

Conclusion

The analysis above has shown us the declining interest in discussing COVID-19 over the past months. One can only hope that this does not translate into the dismissal of Standard Operating Procedures (SOPs) and all other safety measures, for the sake of our health in the midst of this pandemic.

Radio Advert Analysis for Covid-19

Ever been seated there listening to the radio and then you hear an advert about how COVID-19 spreads and how to stay safe? At Sunbird AI, that’s music to our ears.

Our most recent project has been monitoring and analyzing Ministry of Health adverts on the spread of COVID-19 and related safety measures.

The aim of the analysis is to track whether COVID-19 adverts are played on radio stations and how frequently they are played. This is important because broadcasting information about COVID-19 to the public is a priority, in order to keep us healthy and safe.

This project was implemented in a number of steps:

Getting radio data

First, we need the radio data, and that means listening to a whole lot of radio. More than 300 stations, to be exact. And yes, I’m joking: of course we did not manually listen to all that radio.

We went through a digital data collection process as described below:

  • Compilation of a file with streaming URLs for a number of radio stations whose streaming URLs were easily accessible
  • Writing a Python script to get to each of those streaming URLs and record the data for an hour at a time
  • Writing a cron job to run this script at the top of every hour, for most of the hours of the day
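The recording step above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script we ran: the function names, the `.mp3` file extension, and the output layout are hypothetical, and real streams may need extra handling (redirects, playlists, reconnects).

```python
import os
import time
import urllib.request
from datetime import datetime


def record_stream(stream, out_path, duration_seconds, chunk_size=8192):
    """Copy bytes from a file-like `stream` into `out_path` until the
    time budget runs out or the stream ends. Returns bytes written."""
    deadline = time.monotonic() + duration_seconds
    written = 0
    with open(out_path, "wb") as out_file:
        while time.monotonic() < deadline:
            chunk = stream.read(chunk_size)
            if not chunk:  # the server closed the stream
                break
            out_file.write(chunk)
            written += len(chunk)
    return written


def record_station(name, url, out_dir, duration_seconds=3600):
    """Record one station for an hour into a timestamped file.

    `name` and `url` would come from the compiled file of streaming URLs.
    The .mp3 extension is an assumption about the stream format.
    """
    stamp = datetime.now().strftime("%Y%m%d_%H%M")
    out_path = os.path.join(out_dir, f"{name}_{stamp}.mp3")
    with urllib.request.urlopen(url, timeout=30) as response:
        return record_stream(response, out_path, duration_seconds)
```

A crontab entry such as `0 * * * * python3 record_stations.py` (the script name is hypothetical) would start a run at the top of every hour; restricting the hour field, for example `0 6-23 * * *`, would limit it to part of the day.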

Storing the data

By now you could be wondering about the huge amount of storage space that all this data would take up with time. In order to save storage space, a data retention policy is required. A data retention policy within an organization is a set of guidelines that describes which data will be archived, how long it will be kept, what happens to the data at the end of the retention period, and other factors concerning the retention of the data. In our case here, we store the radio recordings on our server only for the current day. At the end of each day, the recordings are backed up to cloud storage and then deleted from the server.
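A retention rule like this one could be sketched as below. It is a simplified stand-in: a local `backup_dir` takes the place of cloud storage, and the function name and the use of file modification time to decide what counts as "the current day" are assumptions for illustration.

```python
import os
import shutil
from datetime import date, datetime


def archive_old_recordings(recordings_dir, backup_dir, today=None):
    """Back up and then delete recordings last modified before `today`.

    `backup_dir` is a local stand-in for cloud storage (an assumption).
    Returns the names of the files that were archived.
    """
    today = today or date.today()
    os.makedirs(backup_dir, exist_ok=True)
    archived = []
    for name in sorted(os.listdir(recordings_dir)):
        path = os.path.join(recordings_dir, name)
        if not os.path.isfile(path):
            continue
        modified = datetime.fromtimestamp(os.path.getmtime(path)).date()
        if modified < today:
            # Copy first, delete only after the copy has succeeded.
            shutil.copy2(path, os.path.join(backup_dir, name))
            os.remove(path)
            archived.append(name)
    return archived
```

Copying before deleting means a failed backup raises an error and leaves the original recording in place.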

Annotating

Before fingerprinting the recordings, we had to find a way to test the accuracy of that method. If the results of fingerprinting say that there are two instances of the advert in a certain recording, we had to know for sure how true that was.

This meant that we needed some form of labeled data, labeled by us human beings, in order to check whether the computer was right. We picked samples from our huge pile of data and annotated them using Audacity, an amazing piece of audio software.

A glimpse into what the data annotation process looks like:

Here is a sample of annotated data for an hour of radio:

As the image shows, there are a lot of different things that go on in just an hour of radio, but what we are looking out for are the Ministry of Health COVID-19 adverts that run for just about a minute. Now that we know that the advert features in this particular hour, we can run the fingerprinting and see if it comes up with the same result.
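Checking a fingerprinting result against the annotations could look roughly like this. The sketch assumes the labels were exported from Audacity as a text file, which writes one label per line as tab-separated start time, end time, and label text. The label name `moh_covid_advert` and the function names are hypothetical.

```python
def parse_audacity_labels(text):
    """Parse the text of an exported Audacity label track.

    Each label line is: start_seconds <TAB> end_seconds <TAB> label.
    Returns a list of (start, end, label) tuples.
    """
    labels = []
    for line in text.splitlines():
        # Skip blank lines and the optional frequency-range lines,
        # which begin with a backslash.
        if not line.strip() or line.startswith("\\"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        label = parts[2].strip() if len(parts) > 2 else ""
        labels.append((float(parts[0]), float(parts[1]), label))
    return labels


def count_label(labels, wanted):
    """Count how many annotated segments carry the label `wanted`."""
    return sum(1 for _, _, label in labels if label.lower() == wanted.lower())


def matches_annotation(labels, wanted, detected_count):
    """True when the detected count agrees with the human annotation."""
    return count_label(labels, wanted) == detected_count
```

For example, if the annotated hour contains two segments labeled `moh_covid_advert`, then `matches_annotation(labels, "moh_covid_advert", 2)` confirms a fingerprinting result of two plays for that recording.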

Fingerprinting

First, what does fingerprinting even mean?

Audio fingerprinting is the process of condensing an audio signal into a compact digital summary, generated by extracting the acoustically relevant characteristics of a piece of audio content.

The short version of this is that it finds a way of identifying a piece of audio.

For our project, we ran a fingerprinting script using a Python tool called dejavu, with the aim of identifying the instances of the COVID-19 adverts played within the radio recordings.

Conclusion

After this process, what we get is the ability to choose any radio recording and find out the number of times the COVID-19 adverts are played. This can be extended according to what is required at a given time, for example checking how frequently the adverts play over an entire day on a particular radio station, or checking how they are spread throughout the day.
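Extensions like these amount to simple aggregation over the detections. The sketch below assumes each detection is a `(station_name, datetime)` pair, which is a hypothetical layout rather than the actual output of our fingerprinting script; the function names are also illustrative.

```python
from collections import Counter
from datetime import datetime


def plays_per_station_day(detections):
    """Count advert plays for each (station, day) pair.

    `detections` is an iterable of (station_name, datetime) pairs.
    """
    counts = Counter()
    for station, played_at in detections:
        counts[(station, played_at.date().isoformat())] += 1
    return dict(counts)


def plays_by_hour(detections, station, day):
    """Show how one station's plays are spread across the hours of `day`.

    `day` is an ISO date string such as "2020-08-01".
    """
    hours = Counter()
    for name, played_at in detections:
        if name == station and played_at.date().isoformat() == day:
            hours[played_at.hour] += 1
    return dict(sorted(hours.items()))


if __name__ == "__main__":
    # Made-up detections purely to show the calls; not real data.
    sample = [
        ("Station A", datetime(2020, 8, 1, 9, 5)),
        ("Station A", datetime(2020, 8, 1, 9, 40)),
        ("Station A", datetime(2020, 8, 1, 17, 15)),
        ("Station B", datetime(2020, 8, 1, 9, 5)),
    ]
    print(plays_per_station_day(sample))
    print(plays_by_hour(sample, "Station A", "2020-08-01"))
```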

In that way, we achieve the goal of tracking the broadcasting of crucial COVID-19 information to the public, to make sure that safety information becomes common knowledge.

Perspectives From Uganda Social Media Data On Covid-19 Measures

This era of COVID-19 has brought with it many things, like social distancing, working from home, and a whole lot of data. Data about the origins of COVID-19, its spread, prevention, reactions and responses, all of which is being constantly studied and analyzed all over the globe.

A good amount of this data comes from social media, given that we live in the social media age. Social media is a data source that is increasingly popular in Uganda, as is shown here. In fast-changing situations, it allows for understanding public perception and the evolution of public discussions. It also sheds light on public misconceptions and can even be a way to assess levels of public engagement.

At Sunbird AI, we are making our contribution to the study of this data by working on a social media analysis project. In this project, we analyze the Ugandan public's response to the COVID-19 era and the new policies that come with it.

Our focus so far has been on Twitter, which provides a special kind of data because people voice their thoughts and reactions to events in real time. Twitter data can capture exactly what people feel about something even as it is unfolding.

Case study: masks

An example of the analysis we did was a case study on the reactions and discussions about the issue of compulsory face masks. This analysis was carried out in the days right before and right after it was announced that a free face mask would be provided to every Ugandan.

Here is a graph showing the tweets about it over a number of days:

We also analyzed the mask-related tweets by general subtopic, as shown below:

Outcomes

From this analysis, we were able to clearly gauge where the major interests of the public lay in relation to the issue of masks. As shown in the above image, the bulk of the mask discussion was about the implementation: how the exercise of distributing masks would be carried out. There were also concerns about the use of masks, i.e. whether it is compulsory to wear a mask, which kinds of masks to wear and how to wear them, as well as a few political concerns.

The image below shows the major themes from the tweets about masks:

Implementation

Our implementation of this project consisted of writing a Python script that sends requests to Twitter’s API and retrieves tweets along with other related information like likes, replies, and hashtags. Then there was a data visualization step using multiple visualization libraries in Python.
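A retrieval step of this kind could look roughly like the sketch below. It is not our actual script: it assumes Twitter's API v2 recent search endpoint and a bearer token, and the query string and function names are hypothetical. The fields read from the response (`public_metrics`, `entities.hashtags`) follow the v2 tweet object.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: Twitter API v2 recent search.
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_search_request(query, bearer_token, max_results=100):
    """Build (but do not send) an authenticated search request."""
    params = urllib.parse.urlencode({
        "query": query,
        "max_results": max_results,
        "tweet.fields": "created_at,public_metrics,entities",
    })
    return urllib.request.Request(
        f"{SEARCH_URL}?{params}",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )


def extract_tweet_info(payload):
    """Pull text, likes, replies, and hashtags out of a search response."""
    rows = []
    for tweet in payload.get("data", []):
        metrics = tweet.get("public_metrics", {})
        hashtags = tweet.get("entities", {}).get("hashtags", [])
        rows.append({
            "text": tweet["text"],
            "likes": metrics.get("like_count", 0),
            "replies": metrics.get("reply_count", 0),
            "hashtags": [tag["tag"] for tag in hashtags],
        })
    return rows


def search_tweets(query, bearer_token):
    """Send the request and return the extracted rows."""
    request = build_search_request(query, bearer_token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_tweet_info(json.load(response))
```

The rows returned by `extract_tweet_info` are plain dictionaries, so they can be passed straight to whichever Python visualization library is used for the plotting step.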