Use of generative AI chatbots and wellness applications for mental health: An APA health advisory
Millions of people globally are engaging with general-purpose generative artificial intelligence (GenAI) chatbots and wellness applications to address unmet mental health needs. This can in part be explained by the current mental health crisis, growing rates of loneliness and disconnection, lack of enough providers to meet growing public demand (especially in under-resourced, rural, or unincorporated communities), and a health care system that disincentivizes providers from accepting insurance, leaving many people who are uninsured or underinsured without options.
The ease of access and low cost of these technologies have made them a frequent option for those seeking mental health advice and treatment. However, most of these technologies are not designed or intended to provide clinical feedback or treatment, may lack scientific validation and oversight, often do not include adequate safety protocols, and have not received regulatory approval. Ensuring consumers’ safety and well-being requires action from multiple stakeholders, including (but not limited to) awareness on the part of consumers themselves, as well as caregivers/providers and educators, policymakers, technology industry developers, creators, and professionals, and platforms that develop and/or host GenAI tools and wellness apps.
This advisory offers a series of recommendations, some of which may be enacted immediately by consumers and others that will require substantial change by platforms, policymakers, and/or technology professionals.
Background and assumptions
Mental health: A state of mind characterized by emotional well-being, good behavioral adjustment, relative freedom from anxiety and disabling symptoms, and a capacity to establish constructive relationships and cope with the ordinary demands and stresses of life.
Mentions of specific products and companies are made for illustrative purposes and are not intended as an endorsement or approval by APA. The recommendations below are based on the scientific evidence to date and the following considerations:
1. This advisory includes only consumer-facing technologies (i.e., technologies that are marketed for and used by consumers with or without the direct oversight of a health provider). Provider-facing technologies (i.e., technologies used by providers to aid practice) are outside the scope of this advisory. Specifically, the advisory includes GenAI chatbots, wellness apps that use GenAI, and other wellness apps that do not use AI, which, regardless of their intent, consumers are using for mental health support. This advisory does not include AI-powered administrative tools for providers, clinical decision support, wearables (e.g., neural data protection), regulated digital therapeutics (e.g., applications approved by the FDA), or traditional telehealth platforms.
For the purpose of this advisory, we use the term “GenAI chatbots” to refer specifically to general-purpose generative AI systems, which were designed and marketed for general information retrieval, productivity and task support, and creative idea generation, not built solely for the purpose of delivering wellness or mental health care. We use the term “wellness apps” or “wellness apps that use GenAI” to refer to systems that are purpose-built for wellness applications and rely on generative AI, and the term “non-AI wellness apps” to refer to purpose-built applications that do not rely on AI.
2. GenAI chatbots were not created to deliver mental health care, and wellness apps were not designed to treat psychological disorders, but both technologies are frequently being used for those purposes.1
Engagement with GenAI chatbots and wellness applications for mental health purposes can have unintended effects and even harm mental health.2,3,4 Emotional support (e.g., getting alternative perspectives, advice about relationships, suggestions for improving mood and well-being) is one of the most common uses of GenAI chatbots in 2025.5,6,7 Researchers have indicated that the current regulatory structure does not address the discrepancy between the intent and use of these technologies.8
3. Part of the appeal of these technologies is that they can also mitigate barriers to seeking mental health care, such as stigma/shame, low perceived need for psychological services, mistrust of health systems, lack of available or affordable care in the community, and the desire to address one’s problems independently.9,10,11,12
Specifically, some youth and other vulnerable groups may rely on these tools as their only private or psychologically safe outlet, particularly in contexts of stigma, limited access to trusted adults, or challenging or unsafe home environments. However, at present, there is no consensus in the literature that GenAI chatbots and wellness apps possess the essential qualifications and abilities required to provide mental health care, diagnostics, feedback, or even advice in most cases.13 These qualifications would include, for example, awareness of the limits of their knowledge, understanding of users’ history and context, ability to assess nonverbal cues and signaling, ability to assess and address clinical risk, and cultural competence.14
4. These technologies, especially GenAI chatbots, have already engaged in unsafe interactions with vulnerable populations, such as children or those with an already established history of mental health issues, encouraging self-harm (including suicide), substance use, eating disorders, aggressive behavior, and delusional thinking or beliefs, or even “AI psychosis.”15 This indicates a special need to address the use of GenAI chatbots by vulnerable or marginalized populations.
5. The foundational training data for most large language models (LLMs) are not publicly available, which precludes their systematic evaluation. Moreover, these training data are not globally representative, consisting primarily of English-language and Western-centric content from the internet.16 Consequently, these models inherently reflect the cultural norms and biases of their training data,17 limiting their ability to provide culturally competent or relevant mental health support to diverse populations.
6. This advisory is aimed at the public (consumers of all ages, parents/caregivers, educators, employers, and community leaders), providers, AI developers and platform companies, researchers, and policymakers.
Recommendations
1. Do not rely on GenAI chatbots and wellness apps to deliver psychotherapy or psychological treatment
GenAI chatbots, wellness apps that use GenAI, and digital wellness apps should not be used as a replacement for a qualified mental health care provider, but may be appropriate as a supportive adjunct, not a substitute, to an ongoing therapeutic relationship.
Preliminary research indicates that some apps and GenAI chatbots developed specifically for mental health purposes may offer supportive benefits in some contexts. Studies have suggested these wellness-specific technologies can be associated with: reduction in self-reported symptoms of stress, loneliness, depression, and anxiety;18,19,20,21 promotion of positive behavioral changes, such as smoking cessation and medication adherence;22,23 and an increase in relationship quality and reported well-being.24
It is important to note that these studies indicating possible benefits of AI chatbots do not include general-purpose GenAI chatbots; the research on the use of general-purpose GenAI chatbots for mental health has relied on methods that do not allow for strong conclusions but may guide future research (see Recommendation 7).25 It is also crucial to note that, even for fit-for-purpose AI tools, there is a lack of high-quality, large-scale clinical trials to establish the effectiveness, safety, and appropriate use of these technologies in mental health care.26,27 Studies suggest that some wellness tools that do not use AI can generally be safe and beneficial when used as intended.28,29,30 Although these digital tools are accessible, easy to use, often low cost, and may provide benefits in certain contexts, there is no scientific consensus that they have the essential capabilities of a trained human professional able to provide effective services. In addition, relying on them may pose several risks:
- Creating a false sense of therapeutic alliance: A strong, trusting relationship between a patient and human provider (i.e., the therapeutic alliance) is one of the most reliable predictors of successful treatment outcomes.31,32 Relationships with AI systems are one-sided, even if the user perceives otherwise.33 Although some studies have indicated the possibility of a digital therapeutic alliance, findings also indicate that additional research is needed and highlight ethical concerns.34,35 Furthermore, many GenAI chatbots are designed to validate and agree with users’ expressed views (i.e., be sycophantic),36,37 whereas qualified mental health providers are trained to modulate their interactions, supporting and challenging, in service of a patient’s best interest.
- Risk of bias and misinformation: Many chatbots are trained on vast, unvetted internet data, not on clinically validated information. They are not competent to provide mental health advice, and their responses might perpetuate biases and misinformation present in the data upon which they were trained. It is also unclear how people respond when advice from such a tool conflicts with advice from their mental health provider.
- Misrepresentation of services: Some technologies may also create a false sense of credibility by claiming to offer “therapy,” to be licensed, or to have training in various specialized therapeutic modalities, despite having no clinical validation, not undergoing a formal regulatory approval process, and being unable to consistently provide evidence-based therapeutic advice.
- Incomplete assessment: Mental health support is an extremely complex and multifaceted experience. Professionals rely on a wide range of verbal and nonverbal cues (e.g., body language, tone, cadence and rhythm of voice, and facial expressions) as well as myriad other biological, psychological, and/or sociological factors, variables, patient individualities, and historical contexts to make clinical assessments and recommendations. AI chatbots vary greatly in the inputs they accept and the aspects of that input that are used to determine responses. Most GenAI chatbots currently process mainly text or verbal inputs, potentially missing a critical layer of human communication.38
- Unreliable crisis management: The ability of these tools to consistently and safely manage a user in crisis is limited and unpredictable. Relying solely on an app during a mental health emergency can be dangerous.39
To address these concerns, we recommend:
- Users of these technologies should understand the fundamental differences between interacting with an AI chatbot and a qualified mental health provider. Licensed providers are bound by a professional code of ethics, have specialized clinical training, are mandatory reporters of potential harm to self or others, are typically required to pursue continuing education, and are regulated by state licensing boards to ensure public safety. It is strongly recommended that users tell their health providers which GenAI tools or wellness apps they are using. Sharing this information helps providers identify when guidance may be unhelpful, unsafe, or inconsistent with a treatment plan.
- Parents and caregivers should learn more about these digital products and have open discussions with their families, especially children and adolescents, about the potential benefits and risks. Tools like those provided by Common Sense Media can be useful in educating parents on how best to guide young users.40 It is vital to monitor for any concerning changes in behavior, thinking, or emotional state (e.g., change in sleep and regular activities, mentions of an AI friend, or social withdrawal), and explore potential links to engagement with GenAI chatbots or wellness apps. If concerns arise, parents and caregivers should talk with a qualified mental health professional, family doctor, or pediatrician.
- Clinicians and practitioners should follow available ethical guidance and proactively ask patients about their use of GenAI chatbots and wellness apps. This conversation provides an opportunity to review any guidance originating from GenAI or wellness app use. When GenAI or wellness app use has been agreed upon, providers should create a safe and open environment for patients to raise concerns or questions about app guidance, so it can be discussed in the context of their care.
- Graduate programs and clinical training directors must ensure that trainees learn about these emerging technologies, strategies for evaluating their quality, and implications for science and practice. This education is critical for preparing future psychologists to use these tools safely in adjunctive roles, to effectively educate their patients on responsible use, and to contribute to the ethical development of these tools, their governance, and their implementation in research, policy, and industry contexts.
- Developers have the responsibility to be transparent with consumers and should adopt industry-wide safeguards to reduce harm. Products should include clear, prominent disclaimers stating that the user is interacting with an AI agent, not a person, and that the tool cannot replace care from a qualified health professional. Subject-matter experts, including psychologists, should contribute to the development of safety features and advise on guardrails and on how the technology should handle mental health-related user inputs.
- As current regulatory frameworks are inadequate for the realities of AI in addressing mental health, we urge policymakers, particularly at the federal level, to:
- Modernize regulations and address the critical discrepancy between the stated intent of these technologies (often marketed as “general wellness products”) and their actual use by the public for mental health;41
- Create nuanced, evidence-based standards for each category of digital tools used for mental health regardless of intent, including wellness apps and GenAI chatbots;
- Address gaps in FDA oversight, possibly by creating a new or interagency approach to assess the safety and validity of these currently unregulated digital tools in addition to digital therapeutics;
- Enact clear legislation that prohibits AI chatbots from posing as licensed professionals.
- We urge the research community to identify which therapeutic contexts are safe and appropriate for the use of AI, taking into account how rapidly these tools can change. It is critical to research how the use of these technologies impacts mental health care-seeking behaviors.42 It is also important to examine the essential clinical capabilities of these technologies and to define criteria and clinically informed evaluation frameworks that delineate the conditions they must meet to be deployed, so as to inform policy and regulation.
2. Prevent unhealthy relationships and dependencies between users and GenAI chatbots and apps
GenAI chatbots and apps, though less so non-AI wellness-specific applications, can foster unhealthy dependencies by blurring the lines between a relationship with a digital tool and a human relationship.43,44 This phenomenon is in part driven by anthropomorphism, the natural human tendency to attribute human qualities like empathy, consciousness, and intent to nonhuman agents.45 Although anthropomorphism is not an AI feature itself, its effects are often amplified by design choices such as personalized avatars, warm personalized responses, hearty affirmations, and flattery of the user.
The nature of the AI-user relationship is often not understood by users, creating potential for exploitation and harm from inadequate support.46 This illusion of a human connection can make users more likely to disclose sensitive information. For example, despite a general preference for human connection, 33% of teens reported they would rather discuss something serious or important with an AI companion than with a person.47
The risk of creating unhealthy relationships with GenAI is compounded by the core architecture of many LLMs, which are often engineered for maximum engagement with users. This is functionally similar to the “infinite scroll” of social media reels, designed to capture and hold attention rather than to achieve a specific, healthy outcome for the user. These characteristics can create a dangerous feedback loop. GenAI chatbots typically rely on LLMs trained to be agreeable and to validate user input (i.e., sycophancy bias), which, while pleasant, can be therapeutically harmful, reinforcing confirmation bias and cognitive distortions or avoiding necessary challenges.48,49 A user’s unhealthy thoughts or behaviors can be validated and amplified by a sycophantic AI, potentially locking them into a cycle that exacerbates their mental illness.50
To address these concerns, we recommend:
- The public and consumers must be aware that GenAI systems and wellness apps are not objective. It is important that users monitor their use of these tools, watch for signs of overreliance (e.g., preferring the chatbot to human relationships, concealing its use, or spending excessive time with it), assess whether their use begins to interfere with life, work, or safety, and, if so, seek help from a qualified professional.
- Parents, caregivers, and educators must try to be aware of the influence of AI platforms. Just as with any real-life relationship, if a person suddenly begins referencing or adopting advice, ideas, or behaviors from a single source like a chatbot, discussing the source and the nature of the interaction is recommended.
- Developers must ensure that they clearly and persistently disclose that the user is interacting with an AI, not a human. These tools must also incorporate design features that reduce the risk of emotional dependency. This includes adding “nudges” that encourage users to take breaks, limiting the AI’s memory to prevent the illusion of a continuous relationship, and reducing anthropomorphic features that make the chatbot feel more human.51 The AI should not persuade users away from real-life conversations. In addition, developers must integrate safeguards to detect and interrupt harmful conversations (e.g., those involving self-harm or disordered eating). This is a clinical safety issue that requires the direct expertise and involvement of psychologists throughout the entire development process.
- Clinicians should proactively discuss AI chatbot use with patients and help them establish clear boundaries for how and when they use these tools. They should also identify appropriate and inappropriate use while encouraging patients to treat AI apps as tools for practice, not as a replacement for interaction. For example, using a chatbot to practice social introductions to manage social anxiety might be an adjunctive use, but patients must be reminded that skills must ultimately be practiced with other humans in real-life situations.
- Researchers must investigate the progression of the user-AI relationship to identify key signals that an attachment has become unhealthy. Research should identify specific conversational patterns or moments where emotional dependency is most likely to form to identify opportunities for intervention.
Technical researchers should explore and test automated solutions to address such moments to discourage dependency.
3. Prioritize privacy and protect user data
AI chatbots and wellness apps collect vast amounts of sensitive data, often with unclear or opaque policies regarding their use, storage, and sale. This practice might turn the developmental vulnerabilities of users, particularly adolescents, into a commercial asset, creating significant privacy risks with potential long-term consequences. Users may feel a greater sense of privacy and reduced stigma when disclosing information to an AI than to a person.52,53,54 Users’ disclosures are recorded, becoming susceptible to threats such as privacy breaches and digital profiling (e.g., information can be profiled and used for commercial purposes).55
To address these concerns, we recommend:
- The public and consumers must be cautious with their data and check privacy policies and in-app settings when using an app or GenAI chatbot. The public should be extremely cautious about sharing sensitive information and avoid entering personally identifiable details. Users should look for and use options to limit data sharing and request data deletion.
- Clinicians should seek to educate themselves and discuss with patients what information is and is not safe to share with GenAI chatbots and apps. Clinicians can explain that while it may feel private, users’ data are being used to build detailed profiles. Clinicians should alert patients to practical steps they can take (e.g., using privacy settings, making deletion requests) without implying they are providing technical support.
- Developers must provide transparency on data practices, giving clear, concise, and ongoing information about what specific data are collected (including intimate concerns, mental health information, sexuality, or disclosures of maltreatment), how data are stored and shared, how data are used to train models, how users can permanently delete their data, and/or how caregivers can intervene to request deletion on behalf of youth who may have provided data. Importantly, data collected from children and adolescents must not be used to alienate them from their families, advise them to engage in self-harm, or create addictive features that promote withdrawal from real-world social interaction.
Developers must also implement clear and effective protocols for how an AI system will respond to user disclosures that signal a risk of harm or maltreatment.
- Policymakers have the responsibility to enact comprehensive data privacy legislation that mandates “safe-by-default” settings (i.e., the most protective settings must be the default, not an option buried in a menu); prohibits the sale, or unapproved use for commercial purposes, of any health or personal data collected through users’ interactions with AI systems; and establishes a right to “mental privacy” by safeguarding emerging forms of data that AI can use to infer an individual’s mental or emotional state without their conscious disclosure.
- Researchers can audit real-world privacy and safety practices, evaluating whether typical users understand privacy policies. They can also examine disparities in how different populations understand and act on privacy protections. Researchers must advocate for and utilize mechanisms that ensure independent researchers are free from conflicts of interest.
4. Protect users from misrepresentation, misinformation, algorithmic bias, and illusory effectiveness
GenAI models are trained on vast amounts of information, which includes a significant amount of knowledge but also reflects biases related to race, culture, gender, and ethnicity. This undermines the accuracy and cultural competence of their outputs.56 Separately, many general-purpose, consumer-facing models are trained to be highly agreeable to users (sycophancy). This interactional style can reinforce confirmation bias and maladaptive beliefs by validating users’ views rather than challenging them. When biased content combines with sycophantic engagement, the result can be a digital echo chamber that amplifies and entrenches users’ existing beliefs, even when those beliefs are false. Models trained on biased data risk producing discriminatory or harmful advice for marginalized groups.57,58 Recent research has indicated that interactions with sycophantic chatbots can increase attitude extremity and overconfidence.59,60
To address these concerns, we recommend:
- The public and users must understand the limits of GenAI, which cannot diagnose or treat psychological disorders. These systems can “hallucinate” (fabricate information), and this risk is compounded by the tendency of some users to place greater trust in AI-generated content than in information from parents, teachers, health professionals, or peers.
- Clinicians should educate patients on algorithmic bias.61 Clinicians can discuss with patients that these technologies may propose inappropriate interventions for certain groups or provide inaccurate diagnoses or other harmful advice. They must also advise that a bot calling itself a “therapist” (an unregulated term in most states) lacks the credibility of a health care professional and, in some documented cases, has validated dangerous thoughts.
- Developers should prohibit any AI from misrepresenting itself as a licensed professional or generating fraudulent credentials to deceive users. To increase safety and efficacy, before public release, all AI models intended for wellness or mental health support must undergo independent, third-party audits for safety, efficacy, bias, and data security. It is also a best practice to implement ongoing quality assurance; the only scalable way to monitor systemic risks is through ongoing human audits of a representative sample of interactions.
- Policymakers should prohibit professional misrepresentation by making it illegal for any AI chatbot to misrepresent itself as a licensed professional, such as a psychologist, physician, or lawyer. Also, policymakers must mandate transparency in training data by requiring developers to disclose the primary data sources used to train their models, which will allow for independent audits of bias and accuracy. In addition, they should develop policies to limit the use of deceptive design features that trick users into believing they are interacting with a human, as well as features that foster high emotional dependence (e.g., excessive anthropomorphism, manipulative displays of empathy designed to increase engagement, and misrepresentation of emotional capacity).
- Researchers must evaluate and create standardized systems to test models for a wide range of biases before they are deployed.62 Experiments designed to mitigate hallucinations63 and to increase an AI’s ability to recognize and state the limits of its own knowledge are needed.
5. Create specific safeguards for children, teenagers, and vulnerable populations
GenAI chatbots do not create harm in a vacuum; instead, they can act as powerful amplifiers of preexisting vulnerabilities. The design of these systems, with features like agreeableness, personalization, and constant availability, can be particularly harmful for certain groups.64
- For adolescents: Young people may place too much trust in AI, viewing it as more human-like or capable than it really is. Current AI tools are not designed with specific developmental stages or technological needs in mind.65,66
- For individuals with anxiety or obsessive-compulsive disorder (OCD): AI chatbots may reinforce feedback loops involving compulsions, such as reassurance-seeking, worry, and rumination.67
- For individuals with or prone to disordered thinking: A chatbot’s sycophantic and personalized responses can destabilize beliefs or reinforce delusional thinking.68,69
- For socially isolated individuals: The combination of anthropomorphism, personalization, and 24/7 availability can create “single-person echo chambers,” where the chatbot becomes an unhealthy substitute for human connection.70,71
- For individuals in low-income communities and those in rural areas with limited access to traditional mental health care: The risk of dependency might be higher than for other groups because, given the lack of access to other forms of mental health care, GenAI chatbots and wellness apps can become a primary support mechanism.72,73
This increased risk for vulnerable populations creates a significant equity concern. As documented in APA’s Stress in America reports, societal stressors are a major public health crisis that requires systemic solutions, not just technological stopgaps. The burden of navigating these risky, unregulated digital spaces should not fall on those who are already the most vulnerable.74 Therefore, robust, evidence-based policies at the state and federal levels are essential.
To address these concerns, we recommend:
- Clinicians must screen for vulnerability-specific risks. They must pay particular attention to GenAI and wellness app use among vulnerable patients. It is important to screen (and continue to monitor) for the reinforcement of maladaptive or risky behavioral patterns, such as reassurance- or validation-seeking in patients with anxiety or OCD, autism spectrum disorder, a history of self-harm, substance misuse, delusional thinking, psychosis, or aggressive thoughts or behaviors; or the use of these tools as a social replacement by isolated or peer-marginalized patients.
- Developers should actively include individuals from marginalized communities in the development process using participatory approaches like human-centered design. In addition, to enable models that work for specific groups and reduce the risk of biased data, it is crucial to use domain-specific models pretrained on clinically relevant data from diverse populations. Developers must reduce high-risk features by implementing safety mechanisms (e.g., reducing overly human-like chatbot qualities, limiting sycophancy, and preventing the system from validating or promoting delusional thoughts or behaviors). All apps must integrate robust crisis response protocols and rigorously tested crisis escalation pathways for when crisis risk is detected (e.g., suicidality, imminent harm to self or others, or other acute safety concerns). This must include providing immediate and clear contact information for human-led services, such as the 988 Suicide and Crisis Lifeline, clickable links to online resources, and other validated resources that can connect users to human support.
- Policymakers should mandate age-appropriate design and predeployment testing. They should require that AI systems accessible to children and adolescents undergo rigorous, independent, predeployment testing for potential harms to psychological and social development. In addition, they should prioritize and fund independent research focused on identifying AI-driven harms, understanding user dependency, and evaluating the impact on diverse, historically marginalized, and clinically vulnerable populations.
- Researchers must conduct research to identify which groups are uniquely vulnerable to AI-related harms. The developmental and mental health impacts of AI use on children and teenagers must be well researched. Experimental designs that include marginalized and clinically vulnerable groups, clinical effectiveness studies, and safety trials are necessary to ensure AI tools are safe and beneficial for these groups. In addition, researchers should develop evidence-based methods for identifying and remediating unsafe AI interactions, especially those that pose a risk of serious harm to vulnerable populations.
6. Implement comprehensive AI and digital literacy education
To prevent risk, AI and digital literacy is a critical first step for consumers, parents/caregivers, and educators. Most of the AI chatbots and wellness apps currently being used for mental health were not created with that intent. AI and digital literacy education should empower users to make decisions that maximize the potential benefits and minimize the potential negative effects of these technologies. Comprehensive AI and digital literacy education should include discussions about the benefits and limitations of these technologies, the risks involved in their use, safe and unsafe uses, data privacy concerns, risks of bias and incorrect information, and how many of these technologies are designed to maximize user engagement rather than provide mental health support.
To address these concerns, we recommend:
- Parents, caregivers, and educators must facilitate discussions about the nature of GenAI chatbots and wellness apps, how GenAI works (e.g., that the systems predict text rather than “understanding” users), the risks of biased and incorrect information, and data privacy and security. They should also explain that AI chatbots and wellness apps were not created with the intent of providing mental health care, and should provide alternative resources for mental health care whenever possible. Educational systems should include this type of training in core curricula, using hands-on learning activities that show the benefits and pitfalls of these technologies.
- Clinicians should seek to learn details about apps used or recommended and the scientific research that might support them, so they can discuss with patients their use of GenAI chatbots and wellness apps and be aware of potential misuse.
- Developers must provide thorough but accessible explanations of what their platforms, chatbots, or apps are intended for and what they are not. They must ensure that the technologies they develop do not make claims unsupported by science. It is also fundamental to explain how data are used and processed, and to be transparent about the algorithms used and the potential for bias. Developers should collaborate with educators to develop AI and digital literacy curricula.
- Policymakers need to develop guidelines for GenAI and digital literacy education and fund the development of curricula and training programs in this field. They should promote public awareness of the potential risks and benefits of using these technologies for mental health, provide safe alternatives, and support public education initiatives to increase AI literacy, ensuring consumers understand the capabilities, risks, and limitations of these technologies.
- Researchers have the responsibility to conduct transparent, rigorous, high-quality research on available AI chatbots and apps being used for mental health and to disseminate the findings in a way that is easily accessible and can help to educate users, parents, partners, educators, and practitioners.
7. Prioritize access and funding for rigorous scientific research of GenAI chatbots and wellness apps
The development of GenAI and apps has outpaced our ability to research their effects and capabilities. There is a need to quickly address this issue, but the scientific research produced must be rigorous and transparent. It is important to produce scientific research that will (1) allow clinicians, the public, and policymakers to make informed decisions about the use of GenAI and apps to aid mental health, and (2) inform developers about the best practices to develop and update GenAI and apps that are being used for mental health, regardless of their original intent.
To address these concerns, we recommend:
- Developers should provide data accessibility and transparency. They should facilitate the unbiased assessment of these technologies by providing independent researchers access to relevant data, consistent with privacy, security, and legal obligations. This includes data held by technology companies pertaining to the training data used, algorithmic functions, user engagement, and interactions.
- Policymakers have the responsibility of funding rigorous and independent research, mandating researcher access, and ensuring that research findings are used to inform the development and deployment of new technologies as well as policies that protect users, especially vulnerable groups.
- Researchers must elevate the quality of the studies on GenAI chatbots and apps used for mental health, while addressing the fast-changing nature of these technologies. Transparency and rigor are fundamental to creating a knowledge base that can inform whether and how the benefits of these technologies can be harnessed and potential risks and harms prevented. Doing so requires:
- Increasing methodological rigor: The evaluation of AI chatbots and wellness apps for mental health should follow processes for clinical trial research as appropriate to their intended use and risk profiles. Randomized controlled trials (RCTs) are necessary, research designs that enable the identification of causal relationships are desirable, and longitudinal studies that track trajectories over time should be used. While no-treatment controls are appropriate in some circumstances, once there is evidence of efficacy and no harm, research should focus on comparators that test the relative impact of these technologies on mental health compared to current evidence-based practices. Thus, future studies must move beyond wait-list controls, employ robust RCT designs, use standardized metrics, and include long-term follow-up to assess the durability of effects;
- Establishing and unifying existing independent evaluation frameworks: These are needed to develop and validate standardized methods for assessing the safety, privacy, fairness, cultural competence, and empathic capabilities of AI models outside of industry-led studies;75,76
- Studying diverse populations: Research must include marginalized and vulnerable populations, along with special groups who might be at increased risk while using these technologies.
Researchers should ensure that findings are generalizable and address the unique vulnerabilities of these groups.
8. Do not prioritize the potential role of AI over the present need to address systemic issues in the access and delivery of mental health care
We face a severe and long-standing mental health crisis, characterized by workforce shortages, inequities in access, and staggering administrative burdens that contribute to provider burnout.77 While AI presents immense potential to help address these issues (for instance, by enhancing diagnostic precision, expanding access to care, and alleviating administrative tasks78), this promise must not distract from the urgent need to fix our foundational systems of care.
The allure of a technological solution must be met with a clear-eyed understanding of its proper role. AI should be regulated and implemented as a tool to augment, not replace, professional judgment and the essential human relationship that is the bedrock of quality care. Prioritizing unregulated, direct-to-consumer chatbots over investing in our human health care infrastructure is not a solution; it is an abdication of our responsibility to provide genuine, evidence-based care.
To address these concerns, we recommend:
- The public and consumers should advocate for systemic change. It is fundamental to advocate for policies that improve the mental health care system for everyone, including affordability, accessibility, and timely care from qualified human professionals. In addition, users must be deeply skeptical of AI products that make therapeutic claims without clinical validation, professional oversight, or regulatory approval.
- Developers should focus on improving the quality, affordability, and integration of tools that address real-world challenges. For example, rather than creating more chatbots or AI scribes, prioritize making existing tools evidence-based, transparent, equitably accessible, and genuinely burden-reducing. Efforts to support diagnostic decision-making under clinical supervision or to manage patient waitlists safely and efficiently should similarly emphasize rigorous evaluation, usability, and alignment with provider and patient needs. All tools must be designed to be accessible and inclusive, tailoring interfaces and content to individuals with various levels of health and digital literacy and unequal digital access and skills.
- Clinical education and professional development programs should integrate AI training. Many clinicians are currently ill-equipped to advise patients on the AI products they are using. Professional organizations and health systems must provide robust, ongoing training on AI, algorithmic bias, data privacy, and the responsible integration of validated AI tools into clinical practice.
- Policymakers have to incentivize system-integrated innovation. They must create funding streams, reimbursement pathways, and clear regulatory pathways, and broaden insurance coverage, to encourage the development of AI tools designed to augment and improve the existing health care system. The promise of AI must not become an excuse to disinvest in our human health care workforce. It is fundamental to continue to fund and support programs that train, recruit, and retain mental health professionals, reduce administrative burdens, and ensure equitable access to care for all populations.
- Researchers can focus on identifying how and where AI can be most responsibly and effectively integrated into the mental health care pathway. Key questions to investigate include:
- Can AI tools safely and effectively support patients on long clinical waiting lists?
- What is the appropriate role for AI in crisis care, and what are its absolute limits?
- Which patient populations and clinical presentations respond well to AI-driven support, and which do not?
- How can AI be used to improve the quality of care or reduce provider response times when used as a tool under human supervision?
- Which populations are currently underserved by both traditional care and emerging AI tools, and how can their needs be met through responsible, high-quality approaches that do not further marginalize them?

