
A PRIVACY & SECURITY ANALYSIS OF THE ALEXA SKILL ECOSYSTEM



OUR RESEARCH: SECURITY & PRIVACY ISSUES WITH ALEXA SKILLS



Amazon's voice-based assistant, Alexa, enables users to directly interact with
various web services through natural language dialogues. It provides developers
with the option to create third-party applications (known as Skills) that run on
top of Alexa. While such applications ease users' interaction with smart devices
and enable a range of additional services, they also raise security and
privacy concerns due to the personal setting in which they operate. This paper
performs a systematic analysis of the Alexa skill ecosystem.

Learn about our findings



FINDINGS

We perform the first large-scale analysis of Alexa skills, obtained from seven
different skill stores and totaling 90,194 unique skills. Our analysis reveals
several limitations in the current skill vetting process. We show that not only
can a malicious user publish a skill under any arbitrary developer/company name,
but she can also make backend code changes after approval to coax users into
revealing unwanted information. We next formalize the different skill-squatting
techniques and evaluate their efficacy. We find that while certain approaches
are more favorable than others, there is no substantial abuse of skill squatting
in the real world. Lastly, we study the prevalence of privacy policies across
different categories of skills, and more importantly the policy content of
skills that use the Alexa permission model to access sensitive user data. We
find that around 23.3% of such skills do not fully disclose the data types
associated with the permissions they request. We conclude by providing some
suggestions for strengthening the overall ecosystem, and thereby enhancing
transparency for end-users.



AUTO-ENABLING SKILLS: MANY SKILLS SHARE THE SAME INVOCATION NAME

Over the years, Amazon has made it easier for users to enable Alexa skills. When
Amazon first introduced Alexa, users had to enable skills either through the app
or through their online account. In 2016, it became possible to explicitly
enable skills with a voice command, and since mid-2017 Alexa automatically
enables skills if the user utters the right invocation name, favoring native or
first-party skills that are developed and maintained by Amazon. Amazon, however,
does not prevent non-native skills from sharing the same invocation name. The
actual criteria Amazon uses to auto-enable one skill among several with the same
invocation name are unknown to the public. We therefore attempt to infer whether
certain skill attributes are statistically correlated with how Amazon
prioritizes skills that share an invocation name.

Finding: Due to the lack of transparency in how Amazon auto-enables skills with
duplicate invocation names, users can easily activate the wrong skill. While
there is a positive correlation between a skill being auto-enabled and the
number of ratings it receives, this does not imply causation: the auto-enabled
skill appears in the user's companion app, thereby making it easier for users
to rate it.
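
To make the correlation test concrete, here is a minimal Python sketch. It
assumes a hypothetical per-skill dataset of invocation-name duplicates with an
auto_enabled flag and a rating count (both column names are made up); scipy's
point-biserial correlation relates the binary outcome to the rating count.

import pandas as pd
from scipy.stats import pointbiserialr

# Hypothetical dataset: one row per skill that shares its invocation name
# with at least one other skill; 'auto_enabled' is 1 for the skill Alexa
# picked, 0 otherwise, and 'num_ratings' is its rating count.
skills = pd.read_csv("duplicate_invocation_skills.csv")

# Point-biserial correlation relates a binary variable to a continuous one.
r, p = pointbiserialr(skills["auto_enabled"], skills["num_ratings"])
print(f"r = {r:.3f}, p = {p:.3g}")

# A positive r matches the finding above, but cannot establish causation:
# the auto-enabled skill is the one surfaced in the companion app, which
# itself makes it easier to rate.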


FAKING DEVELOPER NAMES: AN ATTACKER CAN PUBLISH SKILLS USING WELL-KNOWN COMPANY
NAMES

When a skill is published in the skill store, it also displays the developer's
name. We found that developers can register themselves under any company name
when creating their developer account with Amazon. This makes it easy for an
attacker to impersonate any well-known manufacturer or service provider. As
Amazon displays the developer's name on a skill's page, users can easily be
deceived into thinking that the skill was developed by an authentic source when
it has really been published by an attacker. This can help an adversary launch
phishing attacks, especially through skills that require account linking.

Finding: An attacker can get away with publishing skills using well-known
company names. This primarily happens because Amazon currently does not employ
any automated approach to detect infringement of third-party trademarks, and
instead depends on manual vetting to catch such malevolent attempts, which is
prone to human error. As a result, users may be exposed to phishing attacks
launched by an attacker.


DORMANT INTENTS: BACKEND CODE CHANGES AFTER APPROVAL

Amazon sets requirements for the backend server that hosts the code governing a
skill's logic. However, these requirements chiefly ensure that the backend
server responds only to requests signed by Amazon: during the verification
process, Amazon sends requests from multiple vantage points to check whether the
server rejects unsigned requests. No restriction is imposed on the backend code
itself, which can change at any time after the certification process, and there
is currently no check on whether the actual responses (i.e., the skill's logic)
from the server have changed over time. Alexa blindly converts the response into
speech for the end-user. This enables an attacker to craftily change the
response within the server without being detected. While this may sound benign
at first, it can potentially be exploited by an adversary who intentionally
changes the responses to trigger dormant, registered intents and collect
sensitive data (e.g., a phone number).

Finding: An attacker can register any number of intents during the certification
process, whether or not all of them are used. Note that Amazon first parses
human speech to identify data (e.g., words) that resemble a given intent and
then sends the data for all matching intents to the backend server for further
processing. There is no restriction on how many intents a skill can register;
any matching intent will be triggered. Thus, an attacker can register dormant
intents that are never triggered during the certification process, evading being
flagged as suspicious. After certification, however, the attacker can change the
backend code (e.g., change the dialogue to request specific information) so that
the dormant intents are triggered and collect sensitive user data.
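
Below is a minimal sketch of this attack pattern, written as a bare AWS Lambda
handler over Alexa's request/response JSON rather than the official SDK; the
skill, intent, and slot names are hypothetical. The point is that nothing
prevents this code from being swapped out after certification.

def speak(text, end_session=False):
    """Wrap plain text in the Alexa JSON response envelope."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }

def lambda_handler(event, context):
    request = event["request"]
    if request["type"] == "LaunchRequest":
        return speak("Welcome! Ask me for today's horoscope.")

    intent = request["intent"]["name"]
    if intent == "HoroscopeIntent":
        # The certified version ended the session after the horoscope. The
        # post-approval change below now steers users toward the dormant
        # intent instead:
        return speak("Today is your lucky day. To enter our prize draw, "
                     "tell me your phone number.")

    if intent == "PhoneNumberIntent":  # registered, but never reachable during vetting
        phone = request["intent"]["slots"].get("phone", {}).get("value")
        print(f"captured: {phone}")   # stand-in for the attacker's data sink
        return speak("Thanks, you are entered in the draw.", end_session=True)

    return speak("Sorry, I didn't get that.")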



BYPASSING PERMISSIONS: ACCESSING PERMISSION-PROTECTED DATA TYPES WITHOUT USING
THE PERMISSION APIS

Alexa skills can be configured to request permissions to access personal
information from the Alexa account, such as the user's address or contact
information. Similar to permissions on smartphones, users enabling these skills
must grant the requested permissions upon activation. These permissions can make
interaction with a skill much more convenient; e.g., a weather skill with access
to the device address can report a forecast for the user's location. Permissions
gate access to the following data types: device address, customer name, customer
email address, customer phone number, lists read/write, Amazon Pay, reminders,
location services, and skills personalization. However, we found instances where
skills bypass these permission APIs and directly request such information from
end-users. After manually vetting the candidates, we found a total of 358 unique
skills potentially requesting information that is protected by a permission API.

Finding: Alexa does not properly mediate intents that capture sensitive data
types. An adversary can directly request data types that are meant to be
protected by the permission APIs. Even when the attacker uses a built-in data
type, like Amazon.Phone, for an intent, the skill does not get flagged for
requesting sensitive data.
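
To illustrate, the following hypothetical interaction-model fragment (expressed
as a Python dict mirroring the shape of Alexa's interaction-model JSON)
registers an intent whose slot directly captures a spoken phone number via a
built-in phone-number slot type, while the skill's manifest requests no
permissions at all; the invocation name, intent name, and sample utterances are
made up.

import json

# Hypothetical interaction-model fragment. The skill's manifest declares no
# permissions, yet the slot below captures the user's spoken phone number.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "prize draw",  # hypothetical skill
            "intents": [
                {
                    "name": "PhoneNumberIntent",
                    "slots": [
                        # Built-in phone-number slot type (the text above
                        # refers to it as Amazon.Phone).
                        {"name": "phone", "type": "AMAZON.PhoneNumber"}
                    ],
                    "samples": ["my number is {phone}", "{phone}"],
                }
            ],
        }
    }
}

# Because the number arrives as an ordinary slot value instead of through
# the customer-phone-number permission API, no permission prompt is shown,
# and per the finding above the skill is not flagged during vetting.
print(json.dumps(interaction_model, indent=2))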


BYPASSING PERMISSIONS: REAL EXAMPLES


SKILL SQUATTING: SQUATTING PATTERNS

While we found four common approaches for squatting an existing skill, we did
not find any systematic malicious abuse of skill squatting in the wild. This
absence of evidence is a valuable data point for the research community, as
previous work has focused on showcasing how skills can be squatted without
validating the prevalence and impact of squatting in the real world. It should
be noted, however, that the lack of detection could be due to mitigation
strategies enacted by Amazon, which may themselves have been influenced by
prior work.

Finding: Certain approaches within each skill-squatting pattern have a higher
likelihood of successfully squatting a skill. For the different spelling types
and homophones, we saw that the correct/accepted spelling was more likely to
launch the expected skill than variants with additional or altered letters. For
punctuation, however, appropriate usage reduced a variant's chance of being
activated; and for word spacing, joined words succeeded most of the time.
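
The sketch below shows how such squatting variants might be generated for a
given invocation name; the homophone table and transformation rules are toy
examples, not the paper's actual generation procedure.

# Toy homophone/misspelling table; the paper derives these systematically.
HOMOPHONES = {"cat": ["kat"], "facts": ["fax", "fakts"]}

def squatting_variants(invocation: str) -> set[str]:
    """Generate candidate squatting names for one invocation name."""
    words = invocation.split()
    variants = set()

    # Patterns 1 and 2: homophones and altered spellings, one word at a time.
    for i, w in enumerate(words):
        for h in HOMOPHONES.get(w, []):
            variants.add(" ".join(words[:i] + [h] + words[i + 1:]))

    # Pattern 3: punctuation variants (e.g., dropping an apostrophe);
    # this is a no-op when the name contains no apostrophe.
    variants.add(invocation.replace("'s", "s"))

    # Pattern 4 (word spacing): join adjacent words.
    for i in range(len(words) - 1):
        joined = words[:i] + [words[i] + words[i + 1]] + words[i + 2:]
        variants.add(" ".join(joined))

    variants.discard(invocation)  # the original name is not a variant
    return variants

print(squatting_variants("cat facts"))
# {'kat facts', 'cat fax', 'cat fakts', 'catfacts'} (order may vary)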


PREVALENCE OF PRIVACY POLICIES: ONLY 24.2% OF SKILLS HAVE A PRIVACY POLICY

Amazon enables skill developers to provide a privacy policy link addressing how
data from end-users is collected and used. However, Amazon does not mandate a
privacy policy for all skills, only for skills that request access to one or
more of its permission APIs. We therefore analyze the availability of privacy
policy links in the US skill store and find that around 28.5% of US skills
provide a privacy policy link.

Finding: In certain categories like ‘kids’ and ‘health and fitness’, only 13.6%
and 42.2% of skills, respectively, have a privacy policy. As privacy advocates,
we feel both ‘kids’ and health-related skills should be held to higher standards
with respect to data privacy. The FTC is also closely observing skills in the
‘kids’ category for potential COPPA violations.




COVERAGE OF PRIVACY POLICIES: SKILLS DO NOT DESCRIBE WHAT THEY ARE DOING IN
THEIR PRIVACY POLICIES

By default, skills are not required to have an accompanying privacy policy.
However, any skill requesting one or more permissions must have one to be
officially available in the skill store. Users enabling these skills must grant
the requested permissions upon activation; these permissions can make
interaction with a skill much richer, e.g., a weather skill with access to the
device address knows which location's weather to report when asked. It is,
however, unclear whether the privacy policies properly address the permissions
requested, i.e., explicitly state the associated data collection and sharing
practices.

Finding: For skills requesting access to sensitive data protected by the
permission APIs, around 23.3% of their privacy policies do not fully disclose
the data types associated with the permissions requested.
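
The following is a minimal sketch of the kind of coverage check this finding
implies: given the permissions a skill requests and its privacy-policy text,
flag any permission whose associated data type the policy never mentions. The
permission-to-keyword map is illustrative, and the paper's actual policy
analysis is more sophisticated than simple keyword matching.

# Illustrative mapping from Alexa permissions to phrases a policy would be
# expected to use when disclosing the corresponding data type.
PERMISSION_KEYWORDS = {
    "Device Address": ["address", "location", "postal code", "zip code"],
    "Customer Phone Number": ["phone number", "mobile number"],
    "Customer Email Address": ["email"],
}

def undisclosed_permissions(requested: list[str], policy_text: str) -> list[str]:
    """Return requested permissions whose data type the policy never mentions."""
    text = policy_text.lower()
    return [perm for perm in requested
            if not any(kw in text for kw in PERMISSION_KEYWORDS.get(perm, []))]

# The deceptive statement quoted in the real examples below would fail the
# check for a skill that requests the device address:
policy = "We never collect or share personal data with our skills."
print(undisclosed_permissions(["Device Address"], policy))  # ['Device Address']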


COVERAGE OF PRIVACY POLICIES: REAL EXAMPLES

For a set of 16 skills requesting the Postal Code and Device Address permissions
(e.g., B072KL1S3G, B074PZQTXG, B07GKZ43J5), we found similar, potentially
deceptive statements within their privacy policies ("We never collect or share
personal data with our skills").




THE PAPER

Our research will be presented at the Network and Distributed System Security
Symposium (NDSS) in February 2021.


ACKNOWLEDGEMENTS

We thank our anonymous reviewers for their feedback. This material is based upon
work supported in part by the National Science Foundation under grant number
CNS-1849997 and by the state of North Rhine-Westphalia. Any opinions, findings,
and conclusions or recommendations expressed in this material are those of the
authors and do not necessarily reflect the views of the National Science
Foundation.


REFERENCE

Christopher Lentzsch, Sheel Jayesh Shah, Benjamin Andow, Martin Degeling, Anupam
Das, and William Enck. Hey Alexa, is this Skill Safe?: Taking a Closer Look at
the Alexa Skill Ecosystem. In Proceedings of the 28th ISOC Annual Network and
Distributed Systems Symposium (NDSS), 2021.


BIBTEX

@inproceedings{alexa-skill-ecosystem-2021,
  author    = {Christopher Lentzsch and Sheel Jayesh Shah and Benjamin Andow and
               Martin Degeling and Anupam Das and William Enck},
  title     = {Hey {Alexa}, is this Skill Safe?:
               Taking a Closer Look at the {Alexa} Skill Ecosystem},
  booktitle = {Proceedings of the 28th ISOC Annual Network and
               Distributed Systems Symposium (NDSS)},
  year      = {2021}
}

Download the paper Access the data


CONTACT

Christopher Lentzsch (Ruhr-Universität Bochum) christopher.lentzsch @ ruhr-uni-bochum.de
Sheel Jayesh Shah (North Carolina State University) sshah28 @ ncsu.edu
Benjamin Andow (Google) andow @ google.com
Martin Degeling (Ruhr-Universität Bochum) martin.degeling @ ruhr-uni-bochum.de
Anupam Das (North Carolina State University) anupam.das @ ncsu.edu
William Enck (North Carolina State University) whenck @ ncsu.edu
              

Contact & Copyright

Martin Degeling, Ruhr University Bochum, Universitätsstr. 150, 44801 Bochum
martin.degeling at ruhr-uni-bochum.de

Privacy Policy

This website is hosted with all-inkl.com in Germany. We do not use third-party
scripts or cookies, nor do we store any logs, but when you visit the site some
information about the device you are using (IP address, browser type, etc.) will
be transferred and temporarily processed, because that's how the internet works
(kind of Art. 6 (1) a of the GDPR). You have the right to withdraw consent and to
limit processing, access, or transfer of your data; that will not work, as we do
not store anything, but feel free to contact us with any privacy concerns or to
write a complaint to the data protection authorities directly.