SRIBD News
Advancements in Smart Healthcare: Huatuo Intelligent Agent
The Huatuo Agent, developed by the Shenzhen Research Institute of Big Data (SRIBD), deeply integrates the medical knowledge of HuatuoGPT with the advanced reasoning paradigm of DeepSeek-R1. It has successfully completed full-link deployment testing on the Ascend ecosystem. Trained on more than 400,000 high-quality data entries (50 billion tokens) with the proprietary reinforcement learning algorithm ReMax, the Huatuo Agent delivers precise and efficient medical reasoning and diagnostic capabilities.
Artificial intelligence applications in the medical field depend on combining deep knowledge processing with frequent, demanding reasoning. This is especially evident in complex scenarios such as disease diagnosis, personalized treatment, and clinical decision-making, all of which call for deep reasoning. These requirements align closely with the strengths of DeepSeek-R1.
The DeepSeek-R1 reasoning model excels in knowledge understanding and decision-making in complex scenarios. Its performance indicators, thought processes, and learning capabilities are comparable to those of human experts, enabling it to effectively simulate expert decision-making and analytical thinking.
By adopting the DeepSeek-R1 paradigm, Huatuo Agent significantly improves the overall capabilities of its underlying large language model (LLM) and strengthens the agent's self-reflection and validation mechanisms to ensure the accuracy of diagnostic results.
Additionally, with the support of Huawei Ascend computing power, Huatuo Agent can efficiently process vast amounts of medical data, balancing the enhancement of reasoning capabilities with the efficiency of data processing, providing more reliable solutions for complex medical scenarios.
Compared on key indicators with mainstream models such as DeepSeek-R1, Tongyi Qianwen Qwen 2.5, and Baichuan Intelligent Baichuan M1, Huatuo Agent shows superior performance, particularly in pre-diagnosis capabilities, achieving an overall score of 52.9.
Key comparison indicators include medical record accuracy (whether important clinical symptoms are addressed), interview guideline correctness (whether inductive questions are included), and medical record standardization (whether writing is standardized and summaries are reasonable), among others.
Under the DeepSeek paradigm, the Center for AI Large Foundation Models (AIM) of SRIBD has independently developed the ReMax reinforcement learning algorithm, which improves on the REINFORCE framework to optimize model performance. The core innovation of this algorithm is a new method that estimates the baseline value from greedy responses. Compared to the PPO algorithm proposed by OpenAI, ReMax demonstrates several significant advantages while maintaining comparable model performance:
1. Computational Resource Efficiency: ReMax reduces GPU memory usage during training by approximately 50%.
2. Training Efficiency: ReMax speeds up training by nearly a factor of two.
3. Algorithm Usability: ReMax significantly reduces the number of hyperparameters that need to be adjusted, greatly lowering the cost of algorithm development and tuning.
It is also worth comparing ReMax with the GRPO algorithm proposed by DeepSeek. GRPO relies on extensive sampling to estimate the baseline value. This approach not only incurs high computational costs but also yields estimates that are subject to randomness and noise, making training less stable than with ReMax, which adopts a deterministic estimation method.
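The difference between the two baselines can be illustrated with a toy sketch. This is not SRIBD's implementation: it assumes a softmax policy over three single-step "responses", a hypothetical reward table, and a simplified group-mean baseline in the style of GRPO (no standard-deviation normalization or clipping). All function names are illustrative.

```python
import math
import random


def softmax(logits):
    """Convert logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_response(logits):
    """Deterministic decode: index of the highest-scoring response."""
    return max(range(len(logits)), key=lambda i: logits[i])


def remax_advantage(logits, rewards, rng):
    """Sample one response; baseline is the reward of the greedy response."""
    probs = softmax(logits)
    sampled = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    baseline = rewards[greedy_response(logits)]  # no randomness here
    return sampled, rewards[sampled] - baseline


def group_mean_advantages(logits, rewards, rng, group_size):
    """Simplified GRPO-style baseline: mean reward of a sampled group."""
    probs = softmax(logits)
    samples = rng.choices(range(len(logits)), weights=probs, k=group_size)
    mean = sum(rewards[s] for s in samples) / group_size  # varies per draw
    return [(s, rewards[s] - mean) for s in samples]


def reinforce_step(logits, sampled, advantage, lr):
    """REINFORCE update; d log pi(sampled) / d logit_i = 1[i == sampled] - p_i."""
    probs = softmax(logits)
    return [
        logit + lr * advantage * ((1.0 if i == sampled else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


def train_remax(rewards, steps, lr, seed):
    """Run the toy ReMax loop from uniform logits and return the final logits."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        sampled, advantage = remax_advantage(logits, rewards, rng)
        logits = reinforce_step(logits, sampled, advantage, lr)
    return logits
```

Calling `train_remax([0.1, 0.5, 1.0], steps=2000, lr=0.5, seed=0)` trains the toy policy toward the highest-reward response. Each ReMax step needs one sampled response plus one greedy decode, whereas the group-mean variant needs `group_size` samples and its baseline changes from draw to draw.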
The Huatuo Agent Pre-Diagnosis Platform supports multi-round deep interactions, conducting in-depth analysis of patient inquiries and accurately grasping the patient's symptoms and conditions. Its vast medical database supports knowledge correlation: drawing on numerous cases from top-tier hospitals, it links related medical knowledge points to explain or resolve clinical issues. The platform's causal inference capability allows it to deduce potential causal relationships in disease progression or treatment outcomes, capturing key information and providing scientific support for doctors in formulating treatment plans.
With a solid foundation in large models and research capabilities, SRIBD is dedicated to deeply integrating large model intelligence with real-world healthcare and public welfare scenarios. The launch of the Huatuo Agent marks a significant step forward in the field of medical intelligence.
Through continued iteration, we will keep expanding the full Huatuo Agent ecosystem, which includes intelligent triage, pre-diagnosis, AI family doctor services, and other diversified smart healthcare solutions. Through AI research, we aim to drive the continuous evolution of medical capabilities, inject innovative vitality into the healthcare industry, and make greater contributions to human health.