diff --git a/.DS_Store b/.DS_Store
new file mode 100644
index 0000000..9171b72
Binary files /dev/null and b/.DS_Store differ
diff --git a/1-Introduction/README.md b/1-Introduction/README.md
new file mode 100644
index 0000000..e69de29
diff --git a/1-Introduction/images/jetson-for-beginners- banner.jpg b/1-Introduction/images/jetson-for-beginners- banner.jpg
new file mode 100644
index 0000000..6edcba2
Binary files /dev/null and b/1-Introduction/images/jetson-for-beginners- banner.jpg differ
diff --git a/2-reComputer-Jetson-Platform-Overview/.DS_Store b/2-reComputer-Jetson-Platform-Overview/.DS_Store
new file mode 100644
index 0000000..f3d92ec
Binary files /dev/null and b/2-reComputer-Jetson-Platform-Overview/.DS_Store differ
diff --git a/2-reComputer-Jetson-Platform-Overview/2.1-What-Is-reComputer/.DS_Store b/2-reComputer-Jetson-Platform-Overview/2.1-What-Is-reComputer/.DS_Store
new file mode 100644
index 0000000..3836996
Binary files /dev/null and b/2-reComputer-Jetson-Platform-Overview/2.1-What-Is-reComputer/.DS_Store differ
diff --git a/2-reComputer-Jetson-Platform-Overview/2.1-What-Is-reComputer/README.md b/2-reComputer-Jetson-Platform-Overview/2.1-What-Is-reComputer/README.md
new file mode 100644
index 0000000..676bc98
--- /dev/null
+++ b/2-reComputer-Jetson-Platform-Overview/2.1-What-Is-reComputer/README.md
@@ -0,0 +1,49 @@
+# What is reComputer
+
+The reComputer Jetson series is a comprehensive lineup of NVIDIA Jetson-compatible carrier boards and full systems, covering the full range of Jetson modules from Jetson Nano and Xavier NX to the more advanced Orin Nano, Orin NX, and AGX Orin. Designed for reliability and ease of use, the series helps developers and businesses bring AI to the edge. Our commitment is to simplify access to cutting-edge technology, accelerating the development and deployment of intelligent devices and applications. This makes it easy to deploy AI models and pipelines such as YOLOv8, YOLOv10, CLIP, LLMs, VLMs, Whisper, and RAG across industries, from vision AI in video analytics and robotic sensing to human-like interaction in generative AI.
+
+
+

+ Seeed-NVIDIA-Jetson-Family +

+
+Take our No. 1 star product, the reComputer J4012 with Jetson Orin NX 16GB, as an example. When you place an order on Seeed Bazaar, you'll receive a package containing a reComputer edge box and a standard 19V/5A power adapter (barrel jack, 5.5/2.5mm).
+
+
+

+ computer-vision +

+
+### 🛍️ Get a reComputer Jetson Orin NX Device
+
+| **Device Model** | **Description** | **Link** |
+|:----------------:|:--------------------------:|:------------------:|
+| reComputer J4011 | Powered by NVIDIA Jetson Orin NX 8GB | [**Make an Order**](https://www.seeedstudio.com/reComputer-J4011-p-5585.html) |
+| reComputer J4012 | Powered by NVIDIA Jetson Orin NX 16GB | [**Make an Order**](https://www.seeedstudio.com/reComputer-J4012-p-5586.html) |
+
+The reComputer J4012 edge device consists of one reComputer J401 carrier board, one Jetson Orin NX 16GB module, one heatsink with fan, and one aluminum enclosure:
+
+### - Jetson Orin NX Module
+
+The NVIDIA® Jetson Orin™ NX delivers AI supercomputer performance in a compact system-on-module (SoM) smaller than a credit card. Powered by a low-power NVIDIA Orin SoC, it combines the NVIDIA Ampere™ GPU architecture with advanced 64-bit processing, multi-function video and image capabilities, and Deep Learning Accelerators.
+
+With up to 100 INT8 TOPS for compute and 50 INT8 TOPS for deep learning, it can run multiple neural networks and process high-resolution sensor data simultaneously. The Jetson Orin NX balances performance and power efficiency and offers diverse I/O, including high-speed CSI and PCIe as well as low-speed I2C and GPIO, making it ideal for embedded and edge computing devices where size, weight, and power are critical.
+
+
+### - reComputer J401 Carrier Board
+
+The reComputer J401 carrier board is an advanced, open-source extension board for NVIDIA Jetson Orin Nano/Orin NX modules. It breaks out the typical I/O interfaces of the SoM, including 4x USB 3.2, HDMI 2.1, 2x CSI, 1x RJ45 for GbE, M.2 Key E for Wi-Fi/BLE, M.2 Key M for storage expansion, CAN, and GPIO. The device ships with JetPack 5.1.1 pre-installed on a 128GB NVMe SSD and also supports the JetPack 6 BSP, helping you bring your solution to market faster.
+
+

+ computer-vision +

+
+### - Heatsink with Fan
+
+This aluminum heatsink with fan provides active air cooling for the reComputer J401 carrier board, preventing overheating and thermal throttling under heavy compute loads. It is designed for continuous operation, and the airflow can be adjusted by controlling the fan speed through PWM.
+
+
+### - Aluminum Enclosure
+
+The complete device in its aluminum case measures a compact 130mm x 120mm x 58.5mm, so it can be easily embedded in scenarios ranging from autonomous machines to industrial systems. It supports multiple installation modes such as desktop and wall mounting. The device also carries certifications including RoHS, CE, FCC, KC, and UKCA, so your product is ready for the market.
+
diff --git a/2-reComputer-Jetson-Platform-Overview/2.1-What-Is-reComputer/images/Seeed-NVIDIA-Jetson-Family.png b/2-reComputer-Jetson-Platform-Overview/2.1-What-Is-reComputer/images/Seeed-NVIDIA-Jetson-Family.png
new file mode 100644
index 0000000..eb7077c
Binary files /dev/null and b/2-reComputer-Jetson-Platform-Overview/2.1-What-Is-reComputer/images/Seeed-NVIDIA-Jetson-Family.png differ
diff --git a/2-reComputer-Jetson-Platform-Overview/2.2-NVIDIA-Jetson-Module/README.md b/2-reComputer-Jetson-Platform-Overview/2.2-NVIDIA-Jetson-Module/README.md
new file mode 100644
index 0000000..e69de29
diff --git a/2-reComputer-Jetson-Platform-Overview/README.md b/2-reComputer-Jetson-Platform-Overview/README.md
new file mode 100644
index 0000000..f9841aa
--- /dev/null
+++ b/2-reComputer-Jetson-Platform-Overview/README.md
@@ -0,0 +1,16 @@
+

+ Seeed-NVIDIA-Jetson-Family +

+
+
+
+## 📚 Contents of reComputer Jetson Platform
+
+| **Chapter** | **Content** |
+|:-----------:|:------------------------------------------------:|
+| Module 2.1 | [**What is reComputer**](./2.1-What-Is-reComputer/README.md) |
+| Module 2.2 | [**NVIDIA Jetson Module**](./2.2-NVIDIA-Jetson-Module/README.md) |
+| Module 2.3 | [**Seeed Jetson Compatible Carrier Board**](./2.3-Seeed-Jetson-Compatible-Carrier-Board/README.md) |
+| Module 2.4 | [**Jetson Full System Series**](./2.4-Jetson-Full-System-Series/README.md) |
+| Module 2.5 | [**Accessory Support**](./2.5-Accessory-Support/README.md) |
+| Module 2.6 | [**Application**](./2.6-Application/README.md) |
diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/README.md b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/README.md
new file mode 100644
index 0000000..38259ef
--- /dev/null
+++ b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/README.md
@@ -0,0 +1,114 @@
+# Use NoMachine to remotely connect to the Jetson desktop
+
+## Introduction
+
+NoMachine is remote desktop software that lets users securely access and control computers from anywhere. It uses the NX protocol to deliver fast, smooth performance, so you can work on files, run applications, watch videos, or play games on a remote machine as if you were sitting in front of it. NoMachine supports multiple platforms, including Windows, macOS, Linux, and mobile devices, and is often used for remote work, technical support, and collaboration.
+
+![image-20250926160018342](./images/image-20250926160018342.png)
+
+
+
+## Installation on PC
+
+**Step 1.** Download the installer that matches your PC's operating system.
+
+[NoMachine](https://www.nomachine.com/)
+
+![image-20250926161736173](./images/image-20250926161736173.png)
+
+> Note: Here we use an Ubuntu PC host for the demonstration.
+
+https://download.nomachine.com/download/?id=1&platform=linux
+
+![image-20250926162027354](./images/image-20250926162027354.png)
+
+**Step 2.** Install the client.
+
+```bash
+# INSTALL (writing to /usr requires root privileges)
+cd /usr
+sudo wget https://web9001.nomachine.com/download/9.1/Linux/nomachine_9.1.24_6_x86_64.tar.gz
+sudo tar xvzf nomachine_9.1.24_6_x86_64.tar.gz
+sudo rm nomachine_9.1.24_6_x86_64.tar.gz
+sudo /usr/NX/nxserver --install
+
+```
+
+Other operations (optional)
+
+```bash
+# UPDATE
+cd /usr
+sudo tar xvzf __.tar.gz
+sudo /usr/NX/nxserver --update
+# UNINSTALL
+sudo /usr/NX/scripts/setup/nxserver --uninstall
+sudo rm -rf /usr/NX
+```
+
+## Installation on Jetson
+
+**Step 1.** Download and install
+
+```bash
+wget https://web9001.nomachine.com/download/9.1/Arm/nomachine_9.1.24_6_arm64.deb
+sudo dpkg -i nomachine_9.1.24_6_arm64.deb
+```
+
+After the installation completes, the following key information is displayed:
+
+![image-20250926160530003](./images/image-20250926160530003.png)
+
+If your Jetson is not connected to a display, additional configuration is required to use the remote desktop properly.
+
+**Step 2.** Check the Jetson's IP address
+
+![image-20250926165250083](./images/image-20250926165250083.png)
+
+**Step 3.** Stop the system's graphical display manager and let NoMachine use its built-in virtual display service:
+
+```bash
+# 1. Stop the graphical display manager of the system
+sudo systemctl disable gdm3 --now
+
+# 2. Restart the NoMachine service
+sudo /etc/NX/nxserver --restart
+```
+
+![image-20250926161036069](./images/image-20250926161036069.png) Even though the Jetson is not connected to a monitor, the PC can now display the Jetson's graphical interface through NoMachine.
+
+
+
+> Note: If you later want to connect the Jetson to a monitor, you need to re-enable the gdm3 service. Run the following commands:
+
+```bash
+# Re-enable the gdm3 service to start automatically at boot
+sudo systemctl enable gdm3
+# Start the gdm3 service immediately
+sudo systemctl start gdm3
+```
+
+## Connection
+
+Click Add -> Add connection.
+
+![image-20250926164129169](./images/image-20250926164129169.png)
+
+Enter a name for the connection and the IP address of the Jetson.
+
+![image-20250926164316983](./images/image-20250926164316983.png)
+
+Connect.
+
+![image-20250926164514138](./images/image-20250926164514138.png)
+
+Enter your Jetson username and password.
+
+![image-20250926164548202](./images/image-20250926164548202.png)
+
+You have successfully connected to the Jetson. You can now operate it from your PC.
+
+![image-20250926164651900](./images/image-20250926164651900.png)
+
+> Note: The connection above must be made within a local area network.
If the PC and jetson are not in the same local area network, the connection will fail! + diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926160018342.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926160018342.png new file mode 100644 index 0000000..694da2b Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926160018342.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926160530003.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926160530003.png new file mode 100644 index 0000000..ff3fe11 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926160530003.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926161036069.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926161036069.png new file mode 100644 index 0000000..57ddb2b Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926161036069.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926161736173.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926161736173.png new file mode 100644 index 0000000..25d20b2 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926161736173.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926162027354.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926162027354.png new file mode 100644 index 0000000..594c3fe Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926162027354.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164129169.png 
b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164129169.png new file mode 100644 index 0000000..6e4228c Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164129169.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164316983.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164316983.png new file mode 100644 index 0000000..10602ae Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164316983.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164514138.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164514138.png new file mode 100644 index 0000000..cc321e8 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164514138.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164548202.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164548202.png new file mode 100644 index 0000000..add031b Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164548202.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164651900.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164651900.png new file mode 100644 index 0000000..e7f2cfd Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926164651900.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926165250083.png b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926165250083.png new file mode 100644 index 0000000..a77ae1b Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.10-Nomachine/images/image-20250926165250083.png differ diff --git 
a/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/README.md b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/README.md index 0d97aaa..22529fa 100644 --- a/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/README.md +++ b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/README.md @@ -1,3 +1,142 @@ +# AI and ML + +Artificial Intelligence (AI) and Machine Learning (ML) are rapidly transforming the world, and Nvidia Jetson devices have become the preferred tools for AI and ML developers due to their powerful performance and flexibility. This article will provide a detailed introduction to the basics of AI and ML, the relationship and differences between them, and their practical applications. Additionally, it will delve into why Nvidia Jetson devices are recommended for AI and ML deployment, explaining their advantages and prerequisites. + +

+ + python logo + +

+ + +## What is AI +Artificial Intelligence (AI) is a branch of computer science dedicated to creating systems capable of simulating and performing tasks that require human intelligence. AI encompasses a range of technologies from simple rule-based systems to complex neural networks, aiming to enable machines to learn, reason, perceive, and make decisions like humans. AI technologies include natural language processing, computer vision, robotics, and automation. Its applications are vast, from self-driving cars and smart assistants to medical diagnostic systems, AI is profoundly changing how we live and work. + +Modern AI is typically categorized into the following types: +- **Weak AI (Narrow AI)**: Focuses on specific tasks such as speech recognition or image classification, exhibiting highly specialized intelligence. +- **Strong AI (General AI)**: Possesses broad intelligence, capable of understanding, learning, and adapting to different tasks and environments, similar to human intelligence. + +AI systems process vast amounts of data using algorithms and models to extract patterns and regularities, enabling predictions and decision-making in new situations. Machine Learning (ML) and Deep Learning (DL) are primary methods of achieving AI. ML relies on data and statistical techniques, while DL is based on multi-layered neural networks. + +With the continuous growth of computing power and data availability, AI technology has made significant advancements, offering immense transformative potential across various industries. In the future, AI will continue to evolve, driving innovation and economic growth while also presenting new challenges and ethical issues that require ongoing exploration and solutions. + +

+ + python logo + +

+ +## What is Machine Learning +Machine Learning (ML) is a subfield of Artificial Intelligence (AI) focused on developing algorithms and models that enable computer systems to learn and improve performance from data. Unlike traditional programming methods, machine learning analyzes large datasets to extract patterns and regularities, automatically generating decision rules and predictive models. This approach allows systems to perform tasks such as prediction, classification, and recognition based on data without explicit programming. + +The core of machine learning includes three main categories: +- **Supervised Learning**: Involves training models using labeled datasets, allowing the system to learn from known input-output mappings. Common applications include image recognition, speech recognition, and predictive analytics. +- **Unsupervised Learning**: Involves training with unlabeled data, where the model learns by discovering patterns and structures within the data. Common applications include clustering analysis and dimensionality reduction. +- **Reinforcement Learning**: Involves learning optimal strategies through interaction with the environment and receiving feedback. It is commonly used in areas such as robotic control and game AI. + +The success of machine learning depends on large amounts of high-quality data, powerful computing capabilities, and effective algorithm design. Through continuous iteration and optimization, machine learning systems can excel in handling complex tasks and are increasingly being widely applied across various industries. + +

+ + python logo + +

+ +## The Relationship and Difference Between AI and Machine Learning +Artificial Intelligence (AI) and Machine Learning (ML) are closely related but have distinct definitions and application areas. AI is a broad field aimed at creating systems capable of simulating human intelligence behaviors. These systems can perceive, reason, learn, and make decisions to solve complex problems. Machine Learning, on the other hand, is a primary method for achieving AI, focused on building intelligent systems through learning and improving performance from data. + +Specifically, Machine Learning is a subset of AI that concentrates on developing algorithms and models that enable computer systems to learn from experience without explicit programming instructions. This means that Machine Learning is a crucial pathway for realizing and enhancing the intelligence of AI systems. By utilizing large-scale data and advanced algorithms, Machine Learning allows AI systems to handle complex tasks such as image recognition, speech recognition, and natural language processing. + +The relationship between the two can be understood as follows: AI is an overarching framework that encompasses various technologies and methods, while Machine Learning is a key technique within this framework for achieving intelligence. As technology advances, Machine Learning plays an increasingly vital role in driving the development and application of AI. + +## Applications of AI and machine learning +Artificial Intelligence (AI) and Machine Learning (ML) have extensive applications across various industries, transforming the way we work and live. + +1. **Healthcare**: AI and ML are used for diagnosing diseases, predicting patient risks, personalizing treatment plans, and drug development. For example, AI can analyze medical images to detect early-stage cancer, and ML algorithms can identify potential health issues from electronic health records. + +2. 
**Financial Services**: In finance, AI and ML are applied to risk management, fraud detection, automated trading, and customer service. AI systems can analyze market data in real time to provide investment advice, while ML models can detect unusual trading behavior to prevent fraud. + +3. **Manufacturing**: AI and ML are used in manufacturing for predictive maintenance, quality control, and production optimization. By analyzing sensor data, AI can predict equipment failures, reducing downtime and increasing production efficiency. + +4. **Retail**: Retailers use AI and ML for demand forecasting, personalized recommendations, and inventory management. AI algorithms can analyze consumer behavior to provide personalized product recommendations, enhancing customer satisfaction. + +5. **Transportation**: Self-driving cars are a significant application of AI and ML. Using computer vision and deep learning algorithms, autonomous driving systems can perceive the environment and make driving decisions. Additionally, AI is used to optimize logistics and transportation routes, improving efficiency. + +6. **Customer Service**: Chatbots and virtual assistants are typical applications of AI and ML in customer service. They can handle common inquiries and provide 24/7 support, greatly enhancing the customer experience. + +7. **Entertainment and Media**: AI and ML play crucial roles in content recommendation, image and video recognition, and game development. Streaming platforms use ML algorithms to recommend movies and music that users might like, while AI can generate realistic game scenes and characters. + +The rapid development of AI and ML technologies drives digital transformation across industries, enhancing efficiency and innovation capabilities. In the future, as these technologies continue to mature, AI and ML will demonstrate their immense potential in even more fields. + +

+ + python logo + +

+ +## What is the Nvidia Jetson Device +NVIDIA Jetson is a key player in the field of Artificial Intelligence (AI), representing a significant shift towards edge computing and deep learning. As a micro AI computer, it brings the powerful capabilities of modern AI into small, cost-effective, high-performance, and low-power devices, enabling enthusiasts, learners, and developers alike to easily utilize them. Developers can use Jetson devices to create innovative AI products across multiple industries. + +What sets NVIDIA Jetson apart is its integration with NVIDIA JetPack, a comprehensive suite that includes the JetPack SDK, libraries, APIs, and tools for developing AI applications. This SDK is crucial as it aligns with NVIDIA's vision of democratizing AI development, ensuring that Jetson users have access to resources used throughout the entire NVIDIA ecosystem. The accessibility of these tools fosters a community of developers and learners who can innovate in fields such as computer vision, robotics, and speech processing without requiring substantial financial investment. + +## Why Jetson +- **Modular Flexibility**: Whether for small businesses or large enterprises, the NVIDIA Jetson product line offers modules suited for every type of business. You can choose from a range of modules that are ideal for developing everything from entry-level AI applications to advanced, complex machines. + +- **Unified Software**: NVIDIA Jetson supports a unified software architecture that simplifies the work of software developers. This unified approach eliminates the need for redundant coding when enhancing creations on different Jetson modules. The NVIDIA JetPack SDK includes the Linux operating system (OS), CUDA-accelerated libraries, and APIs for various machine learning domains, including deep learning and computer vision. It also supports machine learning frameworks like TensorFlow, PyTorch, Keras, and computer vision libraries such as OpenCV. 
+ +- **Support for Cloud-Native Technologies**: By supporting cloud-native technologies and workflows, such as orchestration and containerization, the NVIDIA Jetson platform provides developers with the flexibility to quickly develop or upgrade AI products. + +- **Wide Application Scenarios**: + - **Robotics**: Extensively used for object detection, navigation, and manipulation tasks in robotics. + - **Smart Cities**: Applied in areas such as traffic monitoring, security surveillance, and infrastructure management. + - **Healthcare**: Supports medical imaging and diagnostic applications. + - **Edge AI**: Ideal for deploying AI models in resource-constrained environments for real-time inference. + +## High-Performance NVIDIA Jetson Devices + +

+ + python logo + +

+
+
+| **Device Module** | **Description** | **Link** |
+|:---------:|:---------:|:---------:|
+| Jetson Orin Nano Dev Kit, Orin Nano 8GB, 40 TOPS | Developer kit for NVIDIA Jetson Orin Nano | [Buy Here](https://www.seeedstudio.com/NVIDIAr-Jetson-Orintm-Nano-Developer-Kit-p-5617.html) |
+| reComputer J4012, powered by Orin NX 16GB, 100 TOPS | Embedded computer powered by Orin NX | [Buy Here](https://www.seeedstudio.com/reComputer-J4012-p-5586.html) |
+| reComputer J4011, powered by Orin NX 8GB, 70 TOPS | Embedded computer powered by Orin NX | [Buy Here](https://www.seeedstudio.com/reComputer-J4011-p-5585.html) |
+| reComputer J3011, powered by Orin Nano 8GB, 40 TOPS | Embedded computer powered by Orin Nano | [Buy Here](https://www.seeedstudio.com/reComputer-J3011-p-5590.html) |
+| reComputer J3010, powered by Orin Nano 4GB, 20 TOPS | Embedded computer powered by Orin Nano | [Buy Here](https://www.seeedstudio.com/reComputer-J3010-p-5589.html) |
+
+## Essential Knowledge and Prerequisites
+
+To work effectively with NVIDIA Jetson devices and use them for AI and deep learning applications, you should have a solid foundation in the following areas:
+
+1. **Mathematics Fundamentals**:
+   - **Linear Algebra**: Key for understanding neural networks and optimizing algorithms through matrix and vector operations.
+   - **Probability and Statistics**: Essential for model evaluation, hypothesis testing, and data analysis.
+   - **Calculus**: Crucial for understanding gradient descent and optimization processes in algorithms.
+
+2. **Programming Skills**:
+   - **Python**: The most commonly used language for AI and machine learning development.
+   - **Libraries and Frameworks**: Familiarity with libraries, frameworks, and SDKs such as TensorFlow, PyTorch, CUDA, JetPack SDK, TensorRT, and DeepStream is important for developing AI applications.
+
+3. **Computer Science Fundamentals**:
+   - **Data Structures and Algorithms**: Improves code efficiency and helps in understanding algorithm complexity.
+   - **Database Knowledge**: Essential for handling and managing large volumes of data.
+
+4. **Machine Learning and AI Basics**:
+   - **Supervised and Unsupervised Learning**: Understanding different learning methods and their applications.
+   - **Deep Learning**: Knowledge of neural networks, Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformer models.
+
+5. **Hardware Knowledge**:
+   - **Embedded Systems**: Understanding the principles of embedded devices and their operation.
+   - **Hardware Specifications**: Familiarity with Jetson device parameters, including CPU, GPU, and memory specifications.
+
+6. **System and Network Understanding**:
+   - **Linux Systems**: Jetson devices typically run Ubuntu or other Linux distributions, so knowledge of Linux commands and system administration is crucial.
+
+Meeting these prerequisites will help you develop and deploy AI applications on NVIDIA Jetson devices effectively and make the most of their capabilities.
-pass diff --git a/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig1.png b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig1.png new file mode 100644 index 0000000..b764b4e Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig1.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig2.png b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig2.png new file mode 100644 index 0000000..31cd87b Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig2.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig3.png b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig3.png new file mode 100644 index 0000000..5f56d7f Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig3.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig4.png b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig4.png new file mode 100644 index 0000000..924e3d6 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig4.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig5.png b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig5.png new file mode 100644 index 0000000..89a8948 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/images/fig5.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.3-CUDA/README.md b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/README.md new file mode 100644 index 0000000..4553e10 --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/README.md @@ -0,0 +1,162 @@ + +# Nvidia CUDA + +NVIDIA CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model developed by NVIDIA. CUDA enables developers to use high-performance NVIDIA GPUs (Graphics Processing Units) to perform general-purpose computing tasks. 
Through CUDA, developers can leverage the powerful computing capabilities of GPUs for large-scale parallel computations, significantly improving application performance, especially in fields such as scientific computing, machine learning, image processing, and physical simulation. CUDA supports multiple programming languages, such as C, C++, and Python, and provides a rich set of libraries and tools to help developers easily achieve GPU-accelerated computing. + +

+ + cuda logo + +

+ +## Basics of CUDA Programming + + + +How can we call GPU resources in a program to accelerate code execution? By using CUDA, we can easily manage thousands of computing cores within the GPU. The CUDA programming model is a heterogeneous model that requires the CPU and GPU to work together. In CUDA programming, the terms "host" and "device" are used to distinguish between the devices where the code is executed. + +> - Host: CPU and host memory +> - Device: GPU and video memory + +Generally, the main program loads on the CPU, then copies the initialization data to the GPU. The GPU completes the computations and copies the results back to the CPU. + + + +We can use a kernel in a CUDA program to implement the above process. In CUDA, a `kernel` refers to a function that, when called, causes the GPU to launch many threads simultaneously to execute this kernel, thereby achieving parallelism. Each thread executes the kernel using its thread ID to correspond to the index of the input data, ensuring that every thread executes the same kernel but processes different data. + +The GPU has a vast number of computing cores, and CUDA uses a two-level organizational structure to manage these cores. Here are three concepts in CUDA programming that need to be introduced: Thread, Block, and Grid. + +- Thread: A CUDA parallel program is executed by many threads. +- Block: Several threads are grouped together to form a block. +- Grid: Multiple blocks are then organized into a grid. + +> Note: Threads within the same block can synchronize and communicate through shared memory. + +

+ cpu and gpu +

+
+In CUDA, each thread has a built-in index called `threadIdx`, which identifies the thread within its block. Combined with the block index `blockIdx` and the block size `blockDim`, it gives each thread a unique position in the grid, so these values depend on how the Grid and Block are divided.
+
+Additionally, here is a brief introduction to the CUDA memory model, as shown in the diagram below. Each thread has its own private local memory. Each thread block has shared memory that all threads in the block can access and that lives as long as the block does. All threads can access global memory. There are also two read-only memory spaces: constant memory and texture memory. Choosing among these memory spaces is part of program optimization, which will not be discussed in detail here.
+
+

+ cpu and gpu +
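To make the thread indexing described above concrete, here is a minimal sketch in plain Python that runs on the CPU only. It imitates how a one-dimensional grid numbers its threads. The helper names `blocks_needed` and `global_index` and the sizes used are illustrative assumptions, not part of CUDA or Numba.

```python
def blocks_needed(n, threads_per_block):
    """Smallest number of blocks whose threads cover n elements (ceiling division)."""
    return (n + threads_per_block - 1) // threads_per_block


def global_index(block_idx, block_dim, thread_idx):
    """Position of a thread in a 1D grid: blocks are laid out one after another."""
    return block_idx * block_dim + thread_idx


n = 10                 # number of elements to process (illustrative)
threads_per_block = 4  # illustrative block size
num_blocks = blocks_needed(n, threads_per_block)

# Enumerate every (block, thread) pair, as a kernel launch would.
indices = [
    global_index(b, threads_per_block, t)
    for b in range(num_blocks)
    for t in range(threads_per_block)
]

# The last block is only partly used, so a kernel must skip indices >= n.
active = [i for i in indices if i < n]
print(num_blocks, indices, active)
```

With these numbers, three blocks of four threads are launched for ten elements, so the last two threads have no element to process. This is why kernels usually bound their index against the array length.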

+ +## CUDA Demonstration Case + +Now, let's experience GPU acceleration through a simple example. In this example, we will compute the sum of two vectors on both the CPU and GPU, and compare the computation speeds between CPU and GPU. + +### Preparing the Runtime Environment + +The JetPack operating system already includes the CUDA runtime environment. We can check the CUDA version using the jtop tool. Open the terminal and run jtop. + +```bash +jtop +``` +

+ jtop +

+
+> Note: If jtop is not installed on your system, please refer to [this guide](https://github.com/rbonghi/jetson_stats).
+
+In the following, we will write CUDA code in Python, so run the following command in the terminal to install `Numba`.
+
+```bash
+pip install numba
+```
+
+### Running the Example Code
+
+**Step 1:** Create a Python script for this course under the home directory:
+
+```bash
+mkdir -p ~/reComputer_J/3.4_CUDA
+touch ~/reComputer_J/3.4_CUDA/add_compare.py
+```
+
+**Step 2:** Open VSCode in the directory where the script is located:
+
+```bash
+cd ~/reComputer_J/3.4_CUDA
+code .
+```
+
+**Step 3:** In the left navigation bar, we can see `add_compare.py`. Enter the following code in the Python file:
+
+
+ Click to expand code
+
+```python
+from numba import cuda
+import numpy as np
+import time
+
+
+@cuda.jit
+def add_gpu(a, b, out):
+    tx = cuda.threadIdx.x
+    bx = cuda.blockIdx.x
+    block_size = cuda.blockDim.x
+    grid_size = cuda.gridDim.x
+    start = tx + bx * block_size
+    stride = block_size * grid_size
+
+    for i in range(start, a.shape[0], stride):
+        out[i] = a[i] + b[i]
+
+def add_cpu(a, b, out):
+    for i in range(a.shape[0]):
+        out[i] = a[i] + b[i]
+
+def main():
+    # Prepare test data
+    n = 10000000
+    a = np.arange(n).astype(np.float32)
+    b = np.arange(n).astype(np.float32)
+    out = np.empty_like(a)
+
+    # 1. Test CPU computation time
+    start = time.time()
+    add_cpu(a, b, out)
+    print("CPU Time Taken:", time.time()-start)
+
+    # 2. Test GPU computation time
+    # Copy data from host memory to device memory
+    d_a = cuda.to_device(a)
+    d_b = cuda.to_device(b)
+    d_out = cuda.device_array_like(out)
+    # Define the number of threads per block and the number of blocks
+    threads_per_block = 256
+    blocks_per_grid = (n + threads_per_block - 1) // threads_per_block
+    # Warm-up launch so that JIT compilation is not included in the timing
+    add_gpu[blocks_per_grid, threads_per_block](d_a, d_b, d_out)
+    cuda.synchronize()
+    # Compute the result
+    start_gpu = time.time()
+    add_gpu[blocks_per_grid, threads_per_block](d_a, d_b, d_out)
+    # Kernel launches are asynchronous, so wait for the GPU to finish
+    cuda.synchronize()
+    end_gpu = time.time()
+    # Copy the result from device memory back to host memory
+    d_out.copy_to_host(out)
+
+    print("GPU Time Taken:", end_gpu-start_gpu)
+
+
+if __name__ == "__main__":
+    main()
+
+```
+
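The `start`/`stride` loop in `add_gpu` is commonly called a grid-stride loop. As a rough illustration, the sketch below replays that loop thread by thread in plain Python on the CPU, with no GPU or Numba required. The helper names `grid_stride_indices` and `simulate_add` and the tiny sizes are assumptions made for this example only.

```python
def grid_stride_indices(thread_idx, block_idx, block_dim, grid_dim, n):
    """Indices one thread visits in a grid-stride loop over n elements."""
    start = thread_idx + block_idx * block_dim
    stride = block_dim * grid_dim  # total number of threads in the grid
    return list(range(start, n, stride))


def simulate_add(a, b, block_dim, grid_dim):
    """Replay the vector add thread by thread and count writes per element."""
    n = len(a)
    out = [0] * n
    writes = [0] * n
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            for i in grid_stride_indices(thread_idx, block_idx, block_dim, grid_dim, n):
                out[i] = a[i] + b[i]
                writes[i] += 1
    return out, writes


a = list(range(10))
b = list(range(10))
# Deliberately use fewer threads (2 blocks x 3 threads = 6) than elements (10).
out, writes = simulate_add(a, b, block_dim=3, grid_dim=2)
print(out)
print(writes)
print(grid_stride_indices(thread_idx=1, block_idx=0, block_dim=3, grid_dim=2, n=10))
```

Every element is written exactly once even though there are fewer threads than elements, because each thread keeps stepping forward by the total thread count. This is why the kernel stays correct for any launch configuration.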
+
+**Step 4:** Click the run button to see the result of the Python script in the terminal.
+
+![launch vscode](./images/vscode.png)
+
+From the times printed in the terminal, we can see that the GPU completes the vector addition faster than the CPU loop.
+
+## More CUDA Learning Tutorials
+
+| **Tutorial** | **Type** | **Description** |
+|:---------:|:---------:|:---------:|
+| [Getting Started with CUDA Programming(zh)](https://zhuanlan.zhihu.com/p/34587739) | doc | A minimalist tutorial for getting started with CUDA programming, written in Chinese |
+| [CUDA C++ Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) | doc | The programming guide to the CUDA model and interface. |
diff --git a/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/Nvidia_CUDA.png b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/Nvidia_CUDA.png
new file mode 100644
index 0000000..f37c99b
Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/Nvidia_CUDA.png differ
diff --git a/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/cpu_and_gpu.png b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/cpu_and_gpu.png
new file mode 100644
index 0000000..3b95b0d
Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/cpu_and_gpu.png differ
diff --git a/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/cuda_memory.png b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/cuda_memory.png
new file mode 100644
index 0000000..26251de
Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/cuda_memory.png differ
diff --git a/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/jtop.png b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/jtop.png
new file mode 100644
index 0000000..46afbda
Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/jtop.png differ
diff --git a/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/thread_block_grid.png
b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/thread_block_grid.png
new file mode 100644
index 0000000..ab482bb
Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/thread_block_grid.png differ
diff --git a/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/vscode.png b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/vscode.png
new file mode 100644
index 0000000..5f631f9
Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/images/vscode.png differ
diff --git a/3-Basic-Tools-and-Getting-Started/3.3-CUDA/scripts/add_compare.py b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/scripts/add_compare.py
new file mode 100644
index 0000000..ef68b4d
--- /dev/null
+++ b/3-Basic-Tools-and-Getting-Started/3.3-CUDA/scripts/add_compare.py
@@ -0,0 +1,58 @@
+from numba import cuda
+import numpy as np
+import time
+
+
+@cuda.jit
+def add_gpu(a, b, out):
+    tx = cuda.threadIdx.x
+    bx = cuda.blockIdx.x
+    block_size = cuda.blockDim.x
+    grid_size = cuda.gridDim.x
+    start = tx + bx * block_size
+    stride = block_size * grid_size
+
+    for i in range(start, a.shape[0], stride):
+        out[i] = a[i] + b[i]
+
+def add_cpu(a, b, out):
+    for i in range(a.shape[0]):
+        out[i] = a[i] + b[i]
+
+def main():
+    # Prepare test data
+    n = 10000000
+    a = np.arange(n).astype(np.float32)
+    b = np.arange(n).astype(np.float32)
+    out = np.empty_like(a)
+
+    # 1. Test CPU computation time
+    start = time.time()
+    add_cpu(a, b, out)
+    print("CPU Time Taken:", time.time()-start)
+
+    # 2. Test GPU computation time
+    # Copy data from host memory to device memory
+    d_a = cuda.to_device(a)
+    d_b = cuda.to_device(b)
+    d_out = cuda.device_array_like(out)
+    # Define the number of threads per block and the number of blocks
+    threads_per_block = 256
+    blocks_per_grid = (n + threads_per_block - 1) // threads_per_block
+    # Warm-up launch so that JIT compilation is not included in the timing
+    add_gpu[blocks_per_grid, threads_per_block](d_a, d_b, d_out)
+    cuda.synchronize()
+    # Compute the result
+    start_gpu = time.time()
+    add_gpu[blocks_per_grid, threads_per_block](d_a, d_b, d_out)
+    # Kernel launches are asynchronous, so wait for the GPU to finish
+    cuda.synchronize()
+    end_gpu = time.time()
+    # Copy the result from device memory back to host memory
+    d_out.copy_to_host(out)
+
+    print("GPU Time Taken:", end_gpu-start_gpu)
+
+
+if __name__ == "__main__":
+    main()
diff --git a/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/README.md b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/README.md
new file mode 100644
index 0000000..55282e7
--- /dev/null
+++ b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/README.md
@@ -0,0 +1,67 @@
+
+# Nvidia TensorRT
+
+NVIDIA® TensorRT™ is an ecosystem of APIs for high-performance deep learning inference. TensorRT includes an inference runtime and model optimizations that deliver low latency and high throughput for production applications. The TensorRT ecosystem includes TensorRT, TensorRT-LLM, TensorRT Model Optimizer, and TensorRT Cloud.
+
+

+ + tensorrt + +

+
+## "Hello World" for TensorRT
+
+In this sample, we build a PyTorch classification network for handwritten digit recognition. The model is trained and tested on the [MNIST](http://yann.lecun.com/exdb/mnist/) dataset. Finally, the trained weights are used to build a TensorRT engine, which accelerates inference.
+
+This is a test case from Nvidia. Here we deploy and run it on a Jetson device.
+
+> Note: For more information, please refer to:
+> - https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#network_python
+> - https://github.com/NVIDIA/TensorRT/tree/main/samples/python/network_api_pytorch_mnist
+
+### Prepare the run environment
+
+#### TensorRT
+
+JetPack has TensorRT pre-installed.
+
+Please run the following commands in the terminal. If the TensorRT version is printed, TensorRT is installed correctly.
+
+```bash
+sudo apt install nvidia-jetpack
+python3 -c "import tensorrt as trt; print(f'TensorRT version: {trt.__version__}')"
+```
+
+#### PyTorch and Torchvision
+
+Please refer to [Module 3.5](../3.5-Pytorch/README.md) for the installation of PyTorch and Torchvision. This example was tested with version 2.0.0 of torch.
+
+![pytorch-and-torchvision](./images/pytorch-torchvision.png)
+
+### Running the Sample
+
+
+Download the Python scripts from the [`scripts` folder](./scripts/) and copy them to the Jetson device.
+
+Open the terminal and run:
+
+```bash
+cd
+python3 sample.py
+```
+
+If the sample runs successfully, you should see a match between the test case and the prediction.
+
+```bash
+Test Case: 4
+Prediction: 4
+```
+
+> Note: Please ignore any warning messages that appear during the execution of the program.
+ + +## Learn More + +| **Tutorial** | **Type** | **Description** | +|:---------:|:---------:|:---------:| +| [TensorRT Getting Started](https://developer.nvidia.com/tensorrt-getting-started) | website | Nvidia's official getting started tutorial | diff --git a/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/images/TensorRT.png b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/images/TensorRT.png new file mode 100644 index 0000000..f8b76f5 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/images/TensorRT.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/images/pytorch-torchvision.png b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/images/pytorch-torchvision.png new file mode 100644 index 0000000..146fac7 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/images/pytorch-torchvision.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/common.py b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/common.py new file mode 100644 index 0000000..4dc819a --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/common.py @@ -0,0 +1,192 @@ +# +# SPDX-FileCopyrightText: Copyright (c) 1993-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + +import argparse +import os + +import numpy as np + +# Use autoprimaryctx if available (pycuda >= 2021.1) to +# prevent issues with other modules that rely on the primary +# device context. +try: + import pycuda.autoprimaryctx +except ModuleNotFoundError: + import pycuda.autoinit + +import pycuda.driver as cuda +import tensorrt as trt + +try: + # Sometimes python does not understand FileNotFoundError + FileNotFoundError +except NameError: + FileNotFoundError = IOError + +EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH) + + +def GiB(val): + return val * 1 << 30 + + +def add_help(description): + parser = argparse.ArgumentParser(description=description, formatter_class=argparse.ArgumentDefaultsHelpFormatter) + args, _ = parser.parse_known_args() + + +def find_sample_data(description="Runs a TensorRT Python sample", subfolder="", find_files=[], err_msg=""): + """ + Parses sample arguments. + + Args: + description (str): Description of the sample. + subfolder (str): The subfolder containing data relevant to this sample + find_files (str): A list of filenames to find. Each filename will be replaced with an absolute path. + + Returns: + str: Path of data directory. + """ + + # Standard command-line arguments for all samples. + kDEFAULT_DATA_ROOT = os.path.join(os.sep, "usr", "src", "tensorrt", "data") + parser = argparse.ArgumentParser(description=description, formatter_class=argparse.ArgumentDefaultsHelpFormatter) + parser.add_argument( + "-d", + "--datadir", + help="Location of the TensorRT sample data directory, and any additional data directories.", + action="append", + default=[kDEFAULT_DATA_ROOT], + ) + args, _ = parser.parse_known_args() + + def get_data_path(data_dir): + # If the subfolder exists, append it to the path, otherwise use the provided path as-is. + data_path = os.path.join(data_dir, subfolder) + if not os.path.exists(data_path): + if data_dir != kDEFAULT_DATA_ROOT: + print("WARNING: " + data_path + " does not exist. 
Trying " + data_dir + " instead.") + data_path = data_dir + # Make sure data directory exists. + if not (os.path.exists(data_path)) and data_dir != kDEFAULT_DATA_ROOT: + print( + "WARNING: {:} does not exist. Please provide the correct data path with the -d option.".format( + data_path + ) + ) + return data_path + + data_paths = [get_data_path(data_dir) for data_dir in args.datadir] + return data_paths, locate_files(data_paths, find_files, err_msg) + + +def locate_files(data_paths, filenames, err_msg=""): + """ + Locates the specified files in the specified data directories. + If a file exists in multiple data directories, the first directory is used. + + Args: + data_paths (List[str]): The data directories. + filename (List[str]): The names of the files to find. + + Returns: + List[str]: The absolute paths of the files. + + Raises: + FileNotFoundError if a file could not be located. + """ + found_files = [None] * len(filenames) + for data_path in data_paths: + # Find all requested files. + for index, (found, filename) in enumerate(zip(found_files, filenames)): + if not found: + file_path = os.path.abspath(os.path.join(data_path, filename)) + if os.path.exists(file_path): + found_files[index] = file_path + + # Check that all files were found + for f, filename in zip(found_files, filenames): + if not f or not os.path.exists(f): + raise FileNotFoundError( + "Could not find {:}. Searched in data paths: {:}\n{:}".format(filename, data_paths, err_msg) + ) + return found_files + + +# Simple helper data class that's a little nicer to use than a 2-tuple. +class HostDeviceMem(object): + def __init__(self, host_mem, device_mem): + self.host = host_mem + self.device = device_mem + + def __str__(self): + return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device) + + def __repr__(self): + return self.__str__() + + +# Allocates all buffers required for an engine, i.e. host/device inputs/outputs. 
+def allocate_buffers(engine): + inputs = [] + outputs = [] + bindings = [] + stream = cuda.Stream() + for binding in engine: + size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size + dtype = trt.nptype(engine.get_binding_dtype(binding)) + # Allocate host and device buffers + host_mem = cuda.pagelocked_empty(size, dtype) + device_mem = cuda.mem_alloc(host_mem.nbytes) + # Append the device buffer to device bindings. + bindings.append(int(device_mem)) + # Append to the appropriate list. + if engine.binding_is_input(binding): + inputs.append(HostDeviceMem(host_mem, device_mem)) + else: + outputs.append(HostDeviceMem(host_mem, device_mem)) + return inputs, outputs, bindings, stream + + +# This function is generalized for multiple inputs/outputs. +# inputs and outputs are expected to be lists of HostDeviceMem objects. +def do_inference(context, bindings, inputs, outputs, stream, batch_size=1): + # Transfer input data to the GPU. + [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs] + # Run inference. + context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle) + # Transfer predictions back from the GPU. + [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs] + # Synchronize the stream + stream.synchronize() + # Return only the host outputs. + return [out.host for out in outputs] + + +# This function is generalized for multiple inputs/outputs for full dimension networks. +# inputs and outputs are expected to be lists of HostDeviceMem objects. +def do_inference_v2(context, bindings, inputs, outputs, stream): + # Transfer input data to the GPU. + [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs] + # Run inference. + context.execute_async_v2(bindings=bindings, stream_handle=stream.handle) + # Transfer predictions back from the GPU. 
+ [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs] + # Synchronize the stream + stream.synchronize() + # Return only the host outputs. + return [out.host for out in outputs] \ No newline at end of file diff --git a/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/downloader.py b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/downloader.py new file mode 100644 index 0000000..fbb2e81 --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/downloader.py @@ -0,0 +1,227 @@ +# +# SPDX-FileCopyrightText: Copyright (c) 1993-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + +import argparse +import errno +import hashlib +import logging +import os +import sys + + +logger = logging.getLogger("downloader") + + +class DataFile: + """Holder of a data file.""" + + def __init__(self, attr): + self.attr = attr + self.path = attr["path"] + self.url = attr["url"] + if "checksum" not in attr: + logger.warning("Checksum of %s not provided!", self.path) + self.checksum = attr.get("checksum", None) + + def __str__(self): + return str(self.attr) + + +class SampleData: + """Holder of data files of an sample.""" + + def __init__(self, attr): + self.attr = attr + self.sample = attr["sample"] + files = attr.get("files", None) + self.files = [DataFile(f) for f in files] + + def __str__(self): + return str(self.attr) + + +def _loadYAML(yaml_path): + with open(yaml_path, "rb") as f: + import yaml + + y = yaml.load(f, yaml.FullLoader) + return SampleData(y) + + +def _checkMD5(path, refMD5): + md5 = hashlib.md5(open(path, "rb").read()).hexdigest() + return md5 == refMD5 + + +def _createDirIfNeeded(path): + the_dir = os.path.dirname(path) + try: + os.makedirs(the_dir) + except OSError as e: + if e.errno != errno.EEXIST: + raise + + +def download(data_dir, yaml_path, overwrite=False): + """Download the data files specified in YAML file to a directory. + + Return false if the downloaded file or the local copy (if not overwrite) has a different checksum. 
+ """ + sample_data = _loadYAML(yaml_path) + logger.info("Downloading data for %s", sample_data.sample) + + def _downloadFile(path, url): + logger.info("Downloading %s from %s", path, url) + import requests + + r = requests.get(url, stream=True, timeout=5) + size = int(r.headers.get("content-length", 0)) + from tqdm import tqdm + + progress_bar = tqdm(total=size, unit="iB", unit_scale=True) + with open(path, "wb") as fd: + for chunk in r.iter_content(chunk_size=1024): + progress_bar.update(len(chunk)) + fd.write(chunk) + progress_bar.close() + + allGood = True + for f in sample_data.files: + fpath = os.path.join(data_dir, f.path) + if os.path.exists(fpath): + if _checkMD5(fpath, f.checksum): + logger.info("Found local copy %s, skip downloading.", fpath) + continue + else: + logger.warning("Local copy %s has a different checksum!", fpath) + if overwrite: + logging.warning("Removing local copy %s", fpath) + os.remove(fpath) + else: + allGood = False + continue + _createDirIfNeeded(fpath) + _downloadFile(fpath, f.url) + if not _checkMD5(fpath, f.checksum): + logger.error("The downloaded file %s has a different checksum!", fpath) + allGood = False + + return allGood + + +def _parseArgs(): + parser = argparse.ArgumentParser(description="Downloader of TensorRT sample data files.") + parser.add_argument( + "-d", + "--data", + help="Specify the data directory, data will be downloaded to there. $TRT_DATA_DIR will be overwritten by this argument.", + ) + parser.add_argument( + "-f", + "--file", + help="Specify the path to the download.yml, default to `download.yml` in the working directory", + default="download.yml", + ) + parser.add_argument( + "-o", "--overwrite", help="Force to overwrite if MD5 check failed", action="store_true", default=False + ) + parser.add_argument( + "-v", + "--verify", + help="Verify if the data has been downloaded. 
Will not download if specified.", + action="store_true", + default=False, + ) + + args, _ = parser.parse_known_args() + data = os.environ.get("TRT_DATA_DIR", None) if args.data is None else args.data + if data is None: + raise ValueError("Data directory must be specified by either `-d $DATA` or environment variable $TRT_DATA_DIR.") + + return data, args + + +def verifyChecksum(data_dir, yaml_path): + """Verify the checksum of the files described by the YAML. + + Return false of any of the file doesn't existed or checksum is different with the YAML. + """ + sample_data = _loadYAML(yaml_path) + logger.info("Verifying data files and their MD5 for %s", sample_data.sample) + + allGood = True + for f in sample_data.files: + fpath = os.path.join(data_dir, f.path) + if os.path.exists(fpath): + if _checkMD5(fpath, f.checksum): + logger.info("MD5 match for local copy %s", fpath) + else: + logger.error("Local file %s has a different checksum!", fpath) + allGood = False + else: + allGood = False + logger.error("Data file %s doesn't have a local copy", f.path) + + return allGood + + +def main(): + data, args = _parseArgs() + logging.basicConfig() + logger.setLevel(logging.INFO) + + ret = True + if args.verify: + ret = verifyChecksum(data, args.file) + else: + ret = download(data, args.file, args.overwrite) + + if not ret: + # Error of downloading or checksum + sys.exit(1) + + +if __name__ == "__main__": + main() + + +TRT_DATA_DIR = None + + +def getFilePath(path): + """Util to get the full path to the downloaded data files. + + It only works when the sample doesn't have any other command line argument. + """ + global TRT_DATA_DIR + if not TRT_DATA_DIR: + parser = argparse.ArgumentParser(description="Helper of data file download tool") + parser.add_argument( + "-d", + "--data", + help="Specify the data directory where it is saved in. 
$TRT_DATA_DIR will be overwritten by this argument.", + ) + args, _ = parser.parse_known_args() + TRT_DATA_DIR = os.environ.get("TRT_DATA_DIR", None) if args.data is None else args.data + if TRT_DATA_DIR is None: + raise ValueError("Data directory must be specified by either `-d $DATA` or environment variable $TRT_DATA_DIR.") + + fullpath = os.path.join(TRT_DATA_DIR, path) + if not os.path.exists(fullpath): + raise ValueError("Data file %s doesn't exist!" % fullpath) + + return fullpath \ No newline at end of file diff --git a/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/model.py b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/model.py new file mode 100644 index 0000000..f6ba990 --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/model.py @@ -0,0 +1,145 @@ +# +# SPDX-FileCopyrightText: Copyright (c) 1993-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# + +# This file contains functions for training a PyTorch MNIST Model +import torch +import torch.nn as nn +import torch.nn.functional as F +import torch.optim as optim +from torchvision import datasets, transforms +from torch.autograd import Variable + +import numpy as np +import os + +from random import randint + +# Network +class Net(nn.Module): + def __init__(self): + super(Net, self).__init__() + self.conv1 = nn.Conv2d(1, 20, kernel_size=5) + self.conv2 = nn.Conv2d(20, 50, kernel_size=5) + self.fc1 = nn.Linear(800, 500) + self.fc2 = nn.Linear(500, 10) + + def forward(self, x): + x = F.max_pool2d(self.conv1(x), kernel_size=2, stride=2) + x = F.max_pool2d(self.conv2(x), kernel_size=2, stride=2) + x = x.view(-1, 800) + x = F.relu(self.fc1(x)) + x = self.fc2(x) + return F.log_softmax(x, dim=1) + + +class MnistModel(object): + def __init__(self): + self.batch_size = 64 + self.test_batch_size = 100 + self.learning_rate = 0.0025 + self.sgd_momentum = 0.9 + self.log_interval = 100 + # Fetch MNIST data set. + self.train_loader = torch.utils.data.DataLoader( + datasets.MNIST( + "/tmp/mnist/data", + train=True, + download=True, + transform=transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]), + ), + batch_size=self.batch_size, + shuffle=True, + num_workers=1, + timeout=600, + ) + self.test_loader = torch.utils.data.DataLoader( + datasets.MNIST( + "/tmp/mnist/data", + train=False, + transform=transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]), + ), + batch_size=self.test_batch_size, + shuffle=True, + num_workers=1, + timeout=600, + ) + self.network = Net() + if torch.cuda.is_available(): + self.network = self.network.to("cuda") + + # Train the network for one or more epochs, validating after each epoch. 
+ def learn(self, num_epochs=2): + # Train the network for a single epoch + def train(epoch): + self.network.train() + optimizer = optim.SGD(self.network.parameters(), lr=self.learning_rate, momentum=self.sgd_momentum) + for batch, (data, target) in enumerate(self.train_loader): + if torch.cuda.is_available(): + data = data.to("cuda") + target = target.to("cuda") + data, target = Variable(data), Variable(target) + optimizer.zero_grad() + output = self.network(data) + loss = F.nll_loss(output, target) + loss.backward() + optimizer.step() + if batch % self.log_interval == 0: + print( + "Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}".format( + epoch, + batch * len(data), + len(self.train_loader.dataset), + 100.0 * batch / len(self.train_loader), + loss.data.item(), + ) + ) + + # Test the network + def test(epoch): + self.network.eval() + test_loss = 0 + correct = 0 + for data, target in self.test_loader: + with torch.no_grad(): + if torch.cuda.is_available(): + data = data.to("cuda") + target = target.to("cuda") + data, target = Variable(data), Variable(target) + output = self.network(data) + test_loss += F.nll_loss(output, target).data.item() + pred = output.data.max(1)[1] + correct += pred.eq(target.data).cpu().sum() + test_loss /= len(self.test_loader) + print( + "\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n".format( + test_loss, correct, len(self.test_loader.dataset), 100.0 * correct / len(self.test_loader.dataset) + ) + ) + + for e in range(num_epochs): + train(e + 1) + test(e + 1) + + def get_weights(self): + return self.network.state_dict() + + def get_random_testcase(self): + data, target = next(iter(self.test_loader)) + case_num = randint(0, len(data) - 1) + test_case = data.numpy()[case_num].ravel().astype(np.float32) + test_name = target.numpy()[case_num] + return test_case, test_name \ No newline at end of file diff --git a/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/sample.py 
b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/sample.py new file mode 100644 index 0000000..c03ac19 --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.4-TensorRT/scripts/sample.py @@ -0,0 +1,161 @@ +# +# SPDX-FileCopyrightText: Copyright (c) 1993-2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +import os +import sys + +# This sample uses an MNIST PyTorch model to create a TensorRT Inference Engine +import model +import numpy as np + +# Use autoprimaryctx if available (pycuda >= 2021.1) to +# prevent issues with other modules that rely on the primary +# device context. +try: + import pycuda.autoprimaryctx +except ModuleNotFoundError: + import pycuda.autoinit + +import tensorrt as trt + +# sys.path.insert(1, os.path.join(sys.path[0], "..")) +import common + +# You can set the logger severity higher to suppress messages (or lower to display more messages). +TRT_LOGGER = trt.Logger(trt.Logger.WARNING) + + +class ModelData(object): + INPUT_NAME = "data" + INPUT_SHAPE = (1, 1, 28, 28) + OUTPUT_NAME = "prob" + OUTPUT_SIZE = 10 + DTYPE = trt.float32 + + +def populate_network(network, weights): + # Configure the network layers based on the weights provided. 
+ input_tensor = network.add_input(name=ModelData.INPUT_NAME, dtype=ModelData.DTYPE, shape=ModelData.INPUT_SHAPE) + + def add_matmul_as_fc(net, input, outputs, w, b): + assert len(input.shape) >= 3 + m = 1 if len(input.shape) == 3 else input.shape[0] + k = int(np.prod(input.shape) / m) + assert np.prod(input.shape) == m * k + n = int(w.size / k) + assert w.size == n * k + assert b.size == n + + input_reshape = net.add_shuffle(input) + input_reshape.reshape_dims = trt.Dims2(m, k) + + filter_const = net.add_constant(trt.Dims2(n, k), w) + mm = net.add_matrix_multiply( + input_reshape.get_output(0), + trt.MatrixOperation.NONE, + filter_const.get_output(0), + trt.MatrixOperation.TRANSPOSE, + ) + + bias_const = net.add_constant(trt.Dims2(1, n), b) + bias_add = net.add_elementwise(mm.get_output(0), bias_const.get_output(0), trt.ElementWiseOperation.SUM) + + output_reshape = net.add_shuffle(bias_add.get_output(0)) + output_reshape.reshape_dims = trt.Dims4(m, n, 1, 1) + return output_reshape + + conv1_w = weights["conv1.weight"].cpu().numpy() + conv1_b = weights["conv1.bias"].cpu().numpy() + conv1 = network.add_convolution( + input=input_tensor, num_output_maps=20, kernel_shape=(5, 5), kernel=conv1_w, bias=conv1_b + ) + conv1.stride = (1, 1) + + pool1 = network.add_pooling(input=conv1.get_output(0), type=trt.PoolingType.MAX, window_size=(2, 2)) + pool1.stride = (2, 2) + + conv2_w = weights["conv2.weight"].cpu().numpy() + conv2_b = weights["conv2.bias"].cpu().numpy() + conv2 = network.add_convolution(pool1.get_output(0), 50, (5, 5), conv2_w, conv2_b) + conv2.stride = (1, 1) + + pool2 = network.add_pooling(conv2.get_output(0), trt.PoolingType.MAX, (2, 2)) + pool2.stride = (2, 2) + + fc1_w = weights["fc1.weight"].cpu().numpy() + fc1_b = weights["fc1.bias"].cpu().numpy() + fc1 = add_matmul_as_fc(network, pool2.get_output(0), 500, fc1_w, fc1_b) + + relu1 = network.add_activation(input=fc1.get_output(0), type=trt.ActivationType.RELU) + + fc2_w = 
weights["fc2.weight"].cpu().numpy() + fc2_b = weights["fc2.bias"].cpu().numpy() + fc2 = add_matmul_as_fc(network, relu1.get_output(0), ModelData.OUTPUT_SIZE, fc2_w, fc2_b) + + fc2.get_output(0).name = ModelData.OUTPUT_NAME + network.mark_output(tensor=fc2.get_output(0)) + + +def build_engine(weights): + # For more information on TRT basics, refer to the introductory samples. + builder = trt.Builder(TRT_LOGGER) + network = builder.create_network(common.EXPLICIT_BATCH) + config = builder.create_builder_config() + runtime = trt.Runtime(TRT_LOGGER) + + config.max_workspace_size = common.GiB(1) + # Populate the network using weights from the PyTorch model. + populate_network(network, weights) + # Build and return an engine. + plan = builder.build_serialized_network(network, config) + return runtime.deserialize_cuda_engine(plan) + + +# Loads a random test case from pytorch's DataLoader +def load_random_test_case(model, pagelocked_buffer): + # Select an image at random to be the test case. + img, expected_output = model.get_random_testcase() + # Copy to the pagelocked input buffer + np.copyto(pagelocked_buffer, img) + return expected_output + + +def main(): + common.add_help(description="Runs an MNIST network using a PyTorch model") + # Train the PyTorch model + mnist_model = model.MnistModel() + mnist_model.learn() + weights = mnist_model.get_weights() + # Do inference with TensorRT. + engine = build_engine(weights) + + # Build an engine, allocate buffers and create a stream. + # For more information on buffer allocation, refer to the introductory samples. + inputs, outputs, bindings, stream = common.allocate_buffers(engine) + context = engine.create_execution_context() + + case_num = load_random_test_case(mnist_model, pagelocked_buffer=inputs[0].host) + # For more information on performing inference, refer to the introductory samples. + # The common.do_inference function will return a list of outputs - we only have one in this case. 
+ [output] = common.do_inference_v2(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream) + pred = np.argmax(output) + print("Test Case: " + str(case_num)) + print("Prediction: " + str(pred)) + + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/README.md b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/README.md new file mode 100644 index 0000000..a106275 --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/README.md @@ -0,0 +1,133 @@ +# Pytorch + +## Introduction + +Deploying PyTorch models to embedded edge devices is a critical step in bringing AI applications to life. [NVIDIA's Jetson platform](https://www.seeedstudio.com/reComputer-J3010-p-5589.html), with its powerful GPU computing capabilities and comprehensive AI software stack, has become an ideal choice for running PyTorch models. + +However, because Jetson is based on the ARM architecture, which differs from common x86 server environments, setting up a PyTorch environment on it cannot be accomplished with a simple pip install command. Developers often face challenges such as finding the correct version of pre-compiled packages, managing complex dependencies, and performing necessary performance optimizations. + +This article aims to provide a clear and practical guide, focusing on how to quickly and correctly configure the PyTorch environment on the Jetson platform, helping you kickstart your PyTorch development journey on Jetson. + + +

+ + + +
+ + Image from: + pypi + +

+ + +## Installing PyTorch on reComputer NVIDIA Jetson + +### Set Up Your Environment + +- JetPack 5/6: + Make sure you have NVIDIA JetPack 5 or 6 installed on your reComputer. JetPack includes the necessary libraries and tools for developing on NVIDIA Jetson platforms. + +- CUDA: + Verify that CUDA is installed and properly configured. PyTorch relies on CUDA for GPU acceleration. Ensure that the installed CUDA version is compatible with the PyTorch version you plan to install. + +> Run `cat /etc/nv_tegra_release` and `nvcc -V` in your terminal. If the output is similar to the screenshot below, JetPack and CUDA are properly installed on your Jetson. +>

+ +### Installing PyTorch Using a .whl File +To install PyTorch on your reComputer with the specified JetPack and CUDA versions, follow these steps: + +#### Download the PyTorch Wheel File + +Choose the correct wheel file based on your JetPack, CUDA, and Python versions: + + + +- **JetPack 7**: + - [PyTorch 2.9](https://seeedstudio88-my.sharepoint.com/:u:/g/personal/youjiang_yu_seeedstudio88_onmicrosoft_com/EVe_c8F4DR9CluC049HCYoMBhHsIH0QMDQF2Q_kPEWOMcQ?e=xaW7vC) + - [Torchvision 0.24](https://seeedstudio88-my.sharepoint.com/:u:/g/personal/youjiang_yu_seeedstudio88_onmicrosoft_com/ESDkmxLfCW1MkI8YBfrdWVABKNXimZSq0qcDoApbzJazZw?e=8JOUMy) + +- **JetPack 6.1 & 6.2 (L4T R36.4) + CUDA 12.6**: + + - [PyTorch 2.7](https://seeedstudio88-my.sharepoint.com/:u:/g/personal/youjiang_yu_seeedstudio88_onmicrosoft_com/EW2ke8EPcVhGsM2mjCMQOWEBcEuOA45rxNmC0FlkBfhPPg?e=bJ0bOz) + + - [torchvision 0.22.0](https://seeedstudio88-my.sharepoint.com/:u:/g/personal/youjiang_yu_seeedstudio88_onmicrosoft_com/EXyam8X0U3VNqmio2ZBM6osB1IDAurgvkd6JAsJLnahTcA?e=p85hdo) + + - [PyTorch 2.5](https://seeedstudio88-my.sharepoint.com/:u:/g/personal/youjiang_yu_seeedstudio88_onmicrosoft_com/EXZ8MsYzCYdGjR9g3tQwnAIBj1kJodvl-9XQVa9U8XFPZA?e=QJntmH) + + If you see `ImportError: libcusparseLt.so.0: cannot open shared object file: No such file or directory`, install [cuSPARSELt 0.8.1](https://developer.nvidia.com/cusparselt-downloads?target_os=Linux&target_arch=aarch64-jetson&Compilation=Native&Distribution=Ubuntu&target_version=22.04&target_type=deb_local) (select Linux > aarch64-jetson > Native > Ubuntu > 22.04 > deb (local)) and [CUDA 12.6](https://developer.nvidia.com/cuda-12-6-0-download-archive?target_os=Linux&target_arch=aarch64-jetson&Compilation=Native&Distribution=Ubuntu&target_version=22.04&target_type=deb_local) (select Linux > aarch64-jetson > Native > Ubuntu > 22.04 > deb (local)). + + - [torchvision
0.20](https://seeedstudio88-my.sharepoint.com/:u:/g/personal/youjiang_yu_seeedstudio88_onmicrosoft_com/EbaNmRnWK9BHiYRpX4G1VdYBCsxh9qtdQHtsxEN5nAUJhw?e=liIOJ0) + + If torchvision reports an error, please uninstall it and refer to the subsequent steps to compile torchvision 0.20.0 via code. + + + + also [Torch-TensorRT in JetPack](https://docs.pytorch.org/TensorRT/getting_started/jetpack.html) + +- **JetPack 6.0 (L4T R36.2 / R36.3) + CUDA 12.2**: + - [PyTorch 2.3](https://seeedstudio88-my.sharepoint.com/:u:/g/personal/youjiang_yu_seeedstudio88_onmicrosoft_com/Ed30wfVKAydNqwsmo_qICDwBu_mcOJ4S3jTuyI11nNer8A?e=tjDWmv) rename to `torch-2.3.0-cp310-cp310-linux_aarch64.whl` + - [torchvision 0.18](https://seeedstudio88-my.sharepoint.com/:u:/g/personal/youjiang_yu_seeedstudio88_onmicrosoft_com/EY4l6JaUyc5JlPCwlgOUFw0BeVeilxUjcqZLoK4M7WH3TQ?e=lMxO11) rename to `torchvision-0.18.0a0+6043bc2-cp310-cp310-linux_aarch64.whl` + +- **JetPack 6.0 DP (L4T R36.2.0)**: + - [PyTorch 2.2.0](https://developer.download.nvidia.cn/compute/redist/jp/v60dp/pytorch/torch-2.2.0a0+6a974be.nv23.11-cp310-cp310-linux_aarch64.whl) + - [PyTorch 2.1.0](https://nvidia.box.com/shared/static/0h6tk4msrl9xz3evft9t0mpwwwkw7a32.whl) + +- **JetPack 5.x**: + - **JetPack 5.1 (L4T R35.2.1) / JetPack 5.1.1 (L4T R35.3.1) / JetPack 5.1.2 (L4T R35.4.1)**: + - [PyTorch 2.1.0](https://seeedstudio88-my.sharepoint.com/:u:/g/personal/youjiang_yu_seeedstudio88_onmicrosoft_com/EbQ3kJ4pMKRCm5AN6wbDCxAB1l2NUKUQ-7R57XY6E6sHHg?e=h8mbr9) + - **JetPack 5.1 (L4T R35.2.1) / JetPack 5.1.1 (L4T R35.3.1)**: + - [PyTorch 2.0.0](https://nvidia.box.com/shared/static/i8pukc49h3lhak4kkn67tg9j4goqm0m7.whl) + - [PyTorch 1.14.0](https://developer.download.nvidia.com/compute/redist/jp/v51/pytorch/torch-1.14.0a0+44dac51c.nv23.02-cp38-cp38-linux_aarch64.whl) + - **JetPack 5.0 (L4T R34.1) / JetPack 5.0.2 (L4T R35.1) / JetPack 5.1 (L4T R35.2.1) / JetPack 5.1.1 (L4T R35.3.1)**: + - [PyTorch 
1.13.0](https://developer.download.nvidia.com/compute/redist/jp/v502/pytorch/torch-1.13.0a0+d0d6b1f2.nv22.10-cp38-cp38-linux_aarch64.whl) + - [PyTorch 1.12.0](https://developer.download.nvidia.com/compute/redist/jp/v50/pytorch/torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl) + - [PyTorch 1.11.0](https://nvidia.box.com/shared/static/ssf2v7pf5i245fk4i0q926hy4imzs2ph.whl) + +#### Install the Wheel File + +1. **Open a Terminal**: + - Navigate to the directory where you downloaded the `.whl` file. + +2. **Install**: + ```bash + sudo apt-get install python3-pip libopenblas-base libopenmpi-dev libomp-dev + pip3 install 'Cython<3' + pip3 install numpy + sudo pip3 install .whl + ``` + Replace `` with the name of the downloaded `.whl` file. +
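Before installing, it can help to confirm that the wheel's tags match the Python interpreter on the Jetson. The following is only a minimal sketch: `parse_wheel_tags` and `matches_interpreter` are hypothetical helper names (not part of pip or PyTorch), and the parsing assumes the standard wheel filename layout `name-version[-build]-python-abi-platform.whl`, as in the `torch-2.3.0-cp310-cp310-linux_aarch64.whl` name mentioned in the wheel list.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    # A wheel name has 5 fields, or 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename layout: " + filename)
    # The last three fields are always: python tag, ABI tag, platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(tags, version_info=None, machine=None):
    """Return True if the wheel targets this CPython version and CPU architecture.

    version_info and machine default to the running interpreter and host;
    they are parameters only so the check can be tried with other values.
    """
    version_info = version_info or sys.version_info
    machine = machine or platform.machine()
    expected_python = "cp{}{}".format(version_info[0], version_info[1])
    return tags["python"] == expected_python and tags["platform"].endswith(machine)
```

On the Jetson, calling `matches_interpreter(parse_wheel_tags("torch-2.3.0-cp310-cp310-linux_aarch64.whl"))` should return `True` only when the system runs Python 3.10 on aarch64; a `False` result suggests a different wheel is needed.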

+ + + +

+ +#### Verify Installation +To verify that PyTorch has been installed correctly on your system, launch an interactive Python interpreter from the terminal and run the following commands: + + ```python + import torch + print(torch.__version__) + print('CUDA available: ' + str(torch.cuda.is_available())) + print('cuDNN version: ' + str(torch.backends.cudnn.version())) + a = torch.cuda.FloatTensor(2).zero_() + print('Tensor a = ' + str(a)) + b = torch.randn(2).cuda() + print('Tensor b = ' + str(b)) + c = a + b + print('Tensor c = ' + str(c)) + ``` + ```python + import torchvision + print(torchvision.__version__) + ``` +
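Since the wheel list pairs each PyTorch build with a specific torchvision build, a small lookup can flag a mismatched pair after installation. This is a sketch under stated assumptions: `EXPECTED_TORCHVISION`, `major_minor`, and `check_pair` are hypothetical names, and the table contains only the pairs that appear in this guide (2.9 with 0.24, 2.7 with 0.22, 2.5 with 0.20, 2.3 with 0.18).

```python
# Pairs taken from the wheel list in this guide (major.minor only).
EXPECTED_TORCHVISION = {
    "2.9": "0.24",
    "2.7": "0.22",
    "2.5": "0.20",
    "2.3": "0.18",
}


def major_minor(version):
    """Reduce a version string such as '0.18.0a0+6043bc2' to '0.18'."""
    public_part = version.split("+")[0]  # drop any local build suffix
    return ".".join(public_part.split(".")[:2])


def check_pair(torch_version, torchvision_version):
    """Return True/False for a known PyTorch version, or None if it is not in the table."""
    expected = EXPECTED_TORCHVISION.get(major_minor(torch_version))
    if expected is None:
        return None
    return major_minor(torchvision_version) == expected
```

In the interpreter session above, `check_pair(torch.__version__, torchvision.__version__)` would return `True` for a matching pair, `False` for a mismatch, and `None` for versions this guide does not list.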

+ + +## More Tutorial Content + +**Tutorial** | **Type** | **Description** +--- | --- | --- +[Official PyTorch Tutorial](https://pytorch.org/tutorials/beginner/basics/intro.html) | doc | An official PyTorch tutorial that provides a complete learning path. +[PyTorch Development Documentation](https://pytorch.org/docs/stable/index.html) | doc | The official PyTorch development documentation. diff --git a/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig1.png b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig1.png new file mode 100644 index 0000000..39d5e5d Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig1.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig2.png b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig2.png new file mode 100644 index 0000000..0d2f1d5 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig2.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig3.png b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig3.png new file mode 100644 index 0000000..73c24a9 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig3.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig4.png b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig4.png new file mode 100644 index 0000000..3046896 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.5-Pytorch/images/fig4.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.6-Tensorflow/README.md b/3-Basic-Tools-and-Getting-Started/3.6-Tensorflow/README.md new file mode 100644 index 0000000..bb296e2 --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.6-Tensorflow/README.md @@ -0,0 +1 @@ +# TensorFlow \ No newline at end of file diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/README.md b/3-Basic-Tools-and-Getting-Started/3.7-Docker/README.md new file mode 100644 index
0000000..1bc5eda --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.7-Docker/README.md @@ -0,0 +1,234 @@ +# Install Docker + +Docker is an open platform for building, shipping, and running applications using lightweight, portable containers. A container bundles your app with all its dependencies and runtime into an image, so it runs the same way on any machine (your laptop, a server, or the cloud) without "it works on my machine" issues. Unlike virtual machines, containers share the host OS kernel, making them faster to start and more resource-efficient. Developers define images with a simple Dockerfile, publish or pull them from registries like Docker Hub, and orchestrate multi-service setups with Docker Compose. This approach streamlines development, testing, and deployment, improves isolation and security, and makes scaling or rolling back versions straightforward. + +image-20250918145816792 + +## Install the Docker service on Jetson + +------ + +### Install Docker CE + +First, update the apt package index and install the prerequisites: + +```bash +sudo apt update +sudo apt install -y apt-transport-https ca-certificates curl software-properties-common +``` + +Add the official Docker GPG key: + +```bash +curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg +``` + +Add the official Docker repository: + +``` +echo \ + "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] \ + https://download.docker.com/linux/ubuntu \ + $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null +``` + +Install Docker CE: + +``` +sudo apt update +sudo apt install -y docker-ce docker-ce-cli containerd.io +``` + +Check the installation: + +``` +docker --version +``` + +------ + +![image-20250917091343874](images/image-20250917091343874.png) + +If the version number is displayed, the installation was successful. 
+ +### Install NVIDIA Container Toolkit + +**NVIDIA Container Toolkit** is a software suite that enables Docker containers to access NVIDIA GPUs. While Docker itself provides containerization for applications and their dependencies, it does not natively support GPU acceleration. The NVIDIA Container Toolkit bridges this gap by allowing containers to utilize the host's GPU hardware. It maps NVIDIA drivers, CUDA libraries, and other necessary components into the container, enabling GPU-accelerated workloads such as AI, deep learning, and high-performance computing. + +Add the NVIDIA GPG key and repository: + +```bash +sudo apt-get install curl -y + +distribution=$(. /etc/os-release;echo $ID$VERSION_ID) + +curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg + +curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \ + sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \ + sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list +``` + +![image-20250917091537022](images/image-20250917091537022.png) + +Install NVIDIA Container Toolkit: + +```bash +sudo apt update +sudo apt install -y nvidia-container-toolkit +``` + +Configure Docker to use the NVIDIA runtime: + +``` +sudo nvidia-ctk runtime configure --runtime=docker +sudo systemctl restart docker +``` + +------ + +Test whether the GPU can be used in the Docker container. + +```bash +sudo docker run --rm --runtime=nvidia --gpus all --network host ubuntu nvidia-smi +``` + +![image-20250917103828047](images/image-20250917103828047.png) + +If `nvidia-smi` prints the GPU information as in the screenshot above, containers can access the GPU and use CUDA. 
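To check from a script that `nvidia-ctk runtime configure` actually registered the runtime with Docker, one option is to inspect the runtimes Docker reports. The sketch below is illustrative only: `has_nvidia_runtime` and `query_docker_runtimes` are hypothetical helper names, and it assumes that `docker info --format '{{json .Runtimes}}'` prints a JSON object keyed by runtime name and that the runtime is registered under the name `nvidia`.

```python
import json
import subprocess


def has_nvidia_runtime(runtimes_json):
    """Return True if the JSON runtimes map contains an entry named 'nvidia'."""
    runtimes = json.loads(runtimes_json)
    return "nvidia" in runtimes


def query_docker_runtimes():
    """Ask the local Docker daemon for its configured runtimes as a JSON string.

    Requires Docker to be installed and the current user to have access
    to the Docker daemon; raises CalledProcessError otherwise.
    """
    result = subprocess.run(
        ["docker", "info", "--format", "{{json .Runtimes}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On the Jetson, `has_nvidia_runtime(query_docker_runtimes())` returning `False` would suggest re-running the `nvidia-ctk` configuration step and restarting Docker.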
+ +### [Optional]: Allow non-root users to run Docker + +``` +sudo usermod -aG docker $USER +newgrp docker +``` + +After that, you can run `docker` commands directly, without `sudo`. + +## Common Docker commands + +### Pull a Docker image from Docker Hub + +```bash +docker pull ultralytics/ultralytics:8.3.201-jetson-jetpack6 +``` + +![image-20250917133331771](images/image-20250917133331771.png) + +View the pulled image: + +```bash +docker image list +``` + +![image-20250917133428648](images/image-20250917133428648.png) + +Start a container from the image and enter it: + +```bash +docker run --runtime nvidia -it --rm --network=host ultralytics/ultralytics:8.3.201-jetson-jetpack6 +``` + +- **--runtime nvidia** Use the **NVIDIA Container Runtime**, so that the container can access the Jetson's GPU. +- **-it** `-i` keeps the session interactive, and `-t` allocates a pseudo-terminal. +- **--rm** Automatically remove the container after it exits. +- **--network=host** Let the container use the host machine's network directly. + + + +Inside the container, verify that CUDA can be called. + +![image-20250918134423865](images/image-20250918134423865.png) + +The output shows that CUDA works inside the container. + +```bash +#Check the name of the Docker container +docker ps -a +``` + +![image-20250918103244482](images/image-20250918103244482.png) + +### File transfer + +This section shows how to copy files from the host into a container, and from a container back to the host. 
+ +```bash +#Jetson (host) to Docker container +docker cp : +#example +docker cp ./test_file.txt lucid_carson:/ultralytics +``` + +![image-20250918103334077](images/image-20250918103334077.png) + +![image-20250918103359703](images/image-20250918103359703.png) + +``` +#Docker container to Jetson (host) +docker cp : +#example +docker cp lucid_carson:/ultralytics/container_file.txt /home/seeed +``` + +![image-20250918103619026](images/image-20250918103619026.png) + +### Package the image + +```bash +#Check the container name +docker ps -a +``` + +![image-20250918142632980](images/image-20250918142632980.png) + +Commit the container as a new image: + +```bash + #docker commit + docker commit 4fc69a374239 my_container:latest +``` + + This will create a new image. + +![image-20250918142913624](images/image-20250918142913624.png) + +Package the newly created image: + +```bash +#docker save -o +docker save -o my_container.tar my_container:latest +``` + +![image-20250918144309547](images/image-20250918144309547.png) + +After the packaging is completed, a `.tar` file will be generated. 
+ +### Load the .tar image + +```bash +docker load -i my_container.tar +``` + +### Push the image to Docker Hub + +```bash +#Log in to Docker Hub +#docker login -u +docker tag my_container:latest your-dockerhub-username/my-ultralytics:latest +docker push your-dockerhub-username/my-ultralytics:latest + +``` + +### Delete the image + +```bash +docker rmi my_container:latest +``` + + + +## Reference materials + +To learn more about Docker operations, see the [Docker guides](https://docs.docker.com/guides/). \ No newline at end of file diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091102801.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091102801.png new file mode 100644 index 0000000..7ed0be1 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091102801.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091326075.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091326075.png new file mode 100644 index 0000000..d7dc35e Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091326075.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091343874.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091343874.png new file mode 100644 index 0000000..598557f Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091343874.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091537022.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091537022.png new file mode 100644 index 0000000..081f48f Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091537022.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091901861.png 
b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091901861.png new file mode 100644 index 0000000..f0e3ef7 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917091901861.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917092045343.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917092045343.png new file mode 100644 index 0000000..248ba36 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917092045343.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917092208852.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917092208852.png new file mode 100644 index 0000000..bb76c59 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917092208852.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095005105.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095005105.png new file mode 100644 index 0000000..47db32d Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095005105.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095426254.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095426254.png new file mode 100644 index 0000000..1280f12 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095426254.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095500283.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095500283.png new file mode 100644 index 0000000..ee438ac Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095500283.png differ diff --git 
a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095519854.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095519854.png new file mode 100644 index 0000000..1cefa52 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917095519854.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917103828047.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917103828047.png new file mode 100644 index 0000000..a405d49 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917103828047.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917133331771.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917133331771.png new file mode 100644 index 0000000..3d6fa39 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917133331771.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917133428648.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917133428648.png new file mode 100644 index 0000000..22204c1 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917133428648.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917134308736.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917134308736.png new file mode 100644 index 0000000..b8aac14 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250917134308736.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103244482.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103244482.png new file mode 100644 index 0000000..8ffba06 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103244482.png differ 
diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103334077.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103334077.png new file mode 100644 index 0000000..df5f465 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103334077.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103359703.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103359703.png new file mode 100644 index 0000000..46715bb Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103359703.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103619026.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103619026.png new file mode 100644 index 0000000..34ece1d Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918103619026.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918134423865.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918134423865.png new file mode 100644 index 0000000..fcc0a0e Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918134423865.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918142632980.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918142632980.png new file mode 100644 index 0000000..1a804e4 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918142632980.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918142913624.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918142913624.png new file mode 100644 index 0000000..5045ebf Binary files /dev/null and 
b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918142913624.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918144309547.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918144309547.png new file mode 100644 index 0000000..dd0f4b7 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918144309547.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918145816792.png b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918145816792.png new file mode 100644 index 0000000..5742b91 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.7-Docker/images/image-20250918145816792.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/README.md b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/README.md new file mode 100644 index 0000000..57a0ca9 --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/README.md @@ -0,0 +1,204 @@ +# OpenCV with CUDA + +## Introduction + +**OpenCV (Open Source Computer Vision Library)** is an open-source software library designed for computer vision and image processing. It provides hundreds of optimized algorithms for tasks such as image and video analysis, object detection, facial recognition, feature extraction, camera calibration, and 3D reconstruction. + +JetPack comes with OpenCV pre-installed on reComputer devices, but the default build does **not include CUDA acceleration**. This means that while you can use OpenCV for standard image processing, GPU-accelerated features (like `cv::cuda::GpuMat` operations, DNN inference, or real-time video processing) are not available and run on the CPU only. 
+ +By compiling OpenCV from source with **CUDA support**, you can leverage the GPU for many computer vision tasks, which can improve performance for: + +- **Real-time video processing** (e.g., high-resolution camera streams) +- **Deep learning inference** using the OpenCV DNN module +- **Image transformations and filters** that can be parallelized on the GPU +- **Computer vision pipelines in robotics or autonomous systems**, where low latency and high throughput are critical + +This tutorial guides you step by step through building and installing a version of OpenCV that supports CUDA. + +![image-20250918172500287](images/image-20250918172500287.png) + +> Note: The demonstration environment is JetPack 6.2 + +## Compiling OpenCV with CUDA + +**Step 1.** Uninstall the existing OpenCV Python packages and library files + +```bash +pip list | grep opencv +pip3 uninstall opencv-python +sudo apt purge 'libopencv*' +sudo apt autoremove +sudo apt update +``` + +**Step 2.** Install the libraries required for the subsequent compilation process. + +> **Note**: The following dependencies are for JetPack 6.2 on Ubuntu 22.04. Package names may differ in other environments. 
+ +```bash +sudo apt update +sudo apt install -y build-essential checkinstall cmake pkg-config yasm git gfortran +sudo apt install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev +sudo apt install -y libjpeg8-dev libpng-dev libtiff5-dev libavcodec-dev libavformat-dev libswscale-dev libxine2-dev libv4l-dev libdc1394-dev libopenjp2-7-dev +sudo apt install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev libgtk2.0-dev libtbb-dev libatlas-base-dev libfaac-dev libmp3lame-dev libtheora-dev libvorbis-dev libxvidcore-dev libopencore-amrnb-dev libopencore-amrwb-dev x264 v4l-utils +sudo apt install -y \ + python3-dev python3-numpy \ + libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev \ + libavcodec-dev libavformat-dev libswscale-dev libv4l-dev \ + libxvidcore-dev libx264-dev libgtk-3-dev libcanberra-gtk3-dev \ + libatlas-base-dev gfortran +``` + +**Step 3.** Download the official source code. + +Here, we are using version 4.10.0 of [OpenCV](https://github.com/opencv/opencv/releases) and [OpenCV_contrib](https://github.com/opencv/opencv_contrib/releases/tag/4.10.0). + +![image-20250920135447788](images/image-20250920135447788.png) + +```bash +wget https://github.com/opencv/opencv/archive/refs/tags/4.10.0.zip +unzip 4.10.0.zip +rm 4.10.0.zip +wget https://github.com/opencv/opencv_contrib/archive/refs/tags/4.10.0.zip +#opencv_contrib is extracted to opencv-4.10.0/opencv_contrib-4.10.0 +unzip 4.10.0.zip -d opencv-4.10.0/ +cd opencv-4.10.0 +``` + +**Step 4.** Configure the OpenCV build with CMake. 
+ +```bash +mkdir build +cd build +export PATH=/usr/local/cuda-12.6/bin:$PATH && export LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH + +#OPENCV_EXTRA_MODULES_PATH must point to the modules directory of opencv_contrib; adjust it if you extracted opencv_contrib elsewhere +cmake \ +-DCMAKE_BUILD_TYPE=Release \ +-DCMAKE_INSTALL_PREFIX=/usr/local \ +-DOPENCV_ENABLE_NONFREE=ON \ +-DWITH_FFMPEG=ON \ +-DWITH_CUDA=ON \ +-DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.6 \ +-DCUDA_ARCH_BIN=8.7 \ +-DCUDA_ARCH_PTX="" \ +-DENABLE_FAST_MATH=ON \ +-DCUDA_FAST_MATH=ON \ +-DWITH_CUBLAS=ON \ +-DOPENCV_GENERATE_PKGCONFIG=ON \ +-DBUILD_opencv_python2=OFF \ +-DBUILD_opencv_python3=ON \ +-DPYTHON3_EXECUTABLE=$(which python3) \ +-DPYTHON3_INCLUDE_DIR=$(python3 -c "from sysconfig import get_paths as gp; print(gp()['include'])") \ +-DPYTHON3_LIBRARY=$(python3 -c "import sysconfig; print(sysconfig.get_config_var('LIBDIR') + '/libpython' + sysconfig.get_python_version() + '.so')") \ +-DOPENCV_EXTRA_MODULES_PATH=../opencv_contrib-4.10.0/modules \ +.. + +``` + +> **-DWITH_FFMPEG=ON** Enables FFmpeg support so that OpenCV can read and write various video formats (avi, mp4, mkv). If it is turned off, only MJPEG/RAW video streams can be accessed. +> +> **-DWITH_CUDA=ON** Enable CUDA acceleration +> +> **-DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.6** Specify the CUDA installation path +> +> **-DCUDA_ARCH_BIN=8.7** Specify the compute capability of the target GPU. The values for different modules are listed in the table below. 
+> +> | Jetson Model | GPU Architecture | CUDA Cores | Compute Capability (CUDA_ARCH_BIN) | +> | --------------------- | ---------------- | ---------- | ---------------------------------- | +> | **Jetson Nano** | Maxwell | 128 | **5.3** | +> | **Jetson TX1** | Maxwell | 256 | **5.3** | +> | **Jetson TX2** | Pascal | 256 | **6.2** | +> | **Jetson Xavier NX** | Volta | 384 | **7.2** | +> | **Jetson AGX Xavier** | Volta | 512 | **7.2** | +> | **Jetson Orin Nano** | Ampere | 512 | **8.7** | +> | **Jetson Orin NX** | Ampere | 1024 | **8.7** | +> | **Jetson AGX Orin** | Ampere | 2048 | **8.7** | +> +> **-DPYTHON3_EXECUTABLE=$(which python3)** +> **-DPYTHON3_INCLUDE_DIR** +> **-DPYTHON3_LIBRARY** +> +> Specify the paths of the Python 3 interpreter, header files, and shared library. If you use a virtual environment, change these to the paths of the interpreter in your own virtual environment. + +During the CMake process, some configuration information is displayed. + +![image-20250919151723696](images/image-20250919151723696.png) + +**Step 5.** Build the project and install. + +```bash +#The compilation process may take a while! Please wait patiently! +make -j4 +``` + +![5f2e3b7cb50af1e7c09fa111170bfd0d](images/5f2e3b7cb50af1e7c09fa111170bfd0d.png) + +```bash +sudo make install +``` + +![545c9f009141bf2d99fd8f37942328b4](images/545c9f009141bf2d99fd8f37942328b4.png) + +**Step 6.** Verify installation. + +After the installation is completed, open jtop. You will see that "opencv with cuda" has changed to "yes". 
+ +```bash +jtop +``` + +![image-20250920135806440](images/image-20250920135806440.png) + +Test whether Python can successfully import OpenCV. Use the following Python test script: + +```python +# Verify that OpenCV was built with CUDA support +import cv2 +import numpy as np + +print("OpenCV version:", cv2.__version__) + +# Check if OpenCV has the CUDA module installed +print("CUDA available in OpenCV:", cv2.cuda.getCudaEnabledDeviceCount() > 0) + +if cv2.cuda.getCudaEnabledDeviceCount() > 0: + print("Number of CUDA devices:", cv2.cuda.getCudaEnabledDeviceCount()) + cv2.cuda.setDevice(0) + print("CUDA device set to 0") + + # Test the GPU acceleration function + print("Creating test image...") + img = (np.random.rand(1080, 1920, 3) * 255).astype("uint8") + print("Test image shape:", img.shape) + + # Upload to GPU + gpu_img = cv2.cuda_GpuMat() + gpu_img.upload(img) + print("Image uploaded to GPU successfully!") + + # Perform Gaussian blurring on the GPU + print("Performing Gaussian blur on GPU...") + gaussian = cv2.cuda.createGaussianFilter(gpu_img.type(), gpu_img.type(), (15, 15), 0) + gpu_result = gaussian.apply(gpu_img) + + # Download result + result = gpu_result.download() + print("Gaussian blur completed on GPU!") + print("Result shape:", result.shape) + print("✅ OpenCV 4.10.0 with CUDA 12.6 support is working correctly!") + +else: + print("❌ No CUDA device found or OpenCV not built with CUDA.") +``` + +![98c824a91fa3d3d677206575085f34dc](images/98c824a91fa3d3d677206575085f34dc.png) + + + +## Resource + +https://github.com/opencv + +https://docs.opencv.org/4.10.0/d2/dbc/cuda_intro.html + diff --git a/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/545c9f009141bf2d99fd8f37942328b4.png b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/545c9f009141bf2d99fd8f37942328b4.png new file mode 100644 index 0000000..67e6624 Binary files /dev/null and 
b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/545c9f009141bf2d99fd8f37942328b4.png differ diff --git 
a/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/5f2e3b7cb50af1e7c09fa111170bfd0d.png b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/5f2e3b7cb50af1e7c09fa111170bfd0d.png new file mode 100644 index 0000000..4ab7bc9 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/5f2e3b7cb50af1e7c09fa111170bfd0d.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/98c824a91fa3d3d677206575085f34dc.png b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/98c824a91fa3d3d677206575085f34dc.png new file mode 100644 index 0000000..70f45bc Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/98c824a91fa3d3d677206575085f34dc.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250918170605672.png b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250918170605672.png new file mode 100644 index 0000000..f00bf83 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250918170605672.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250918172500287.png b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250918172500287.png new file mode 100644 index 0000000..382323f Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250918172500287.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250919151723696.png b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250919151723696.png new file mode 100644 index 0000000..9b730b0 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250919151723696.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250920135447788.png 
b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250920135447788.png new file mode 100644 index 0000000..d08af4c Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250920135447788.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250920135806440.png b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250920135806440.png new file mode 100644 index 0000000..776e803 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/images/image-20250920135806440.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.9-ROS/README.md b/3-Basic-Tools-and-Getting-Started/3.9-ROS/README.md new file mode 100644 index 0000000..c44adf3 --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/3.9-ROS/README.md @@ -0,0 +1,186 @@ +# Install ROS in reComputer + +## Introduction + +[**ROS (Robot Operating System)**](https://www.ros.org/) is an open-source framework for building robot software. It is not actually an operating system, but rather a flexible middleware that provides tools, libraries, and conventions to simplify the development of complex and distributed robotic systems. + +Key features of ROS include: + +- **Modular Design** – It organizes software into packages and nodes, making code reusable and easy to maintain. +- **Communication Infrastructure** – ROS uses a publisher/subscriber model for message passing and a service/action system for request/response interactions between nodes. +- **Simulation and Visualization** – Tools like RViz and Gazebo allow developers to visualize sensor data and simulate robots in a virtual environment. 
+- **Hardware Abstraction** – ROS provides drivers for many sensors and actuators, making it easier to integrate real-world hardware. + +![image-20250918154012205](images/image-20250918154012205.png) + +This tutorial will show you how to install the development environment for ROS1 and ROS2 on [reComputer](https://www.seeedstudio.com/reComputer-Robotics-J4012-with-GMSL-extension-board-p-6537.html). + +> **Note:** Select the ROS version that matches the Ubuntu version on your system. + +| **JetPack Version** | **Ubuntu Version** | **Recommended ROS Version** | **Use Case** | +| ------------------- | ------------------ | :-------------------------: | ------------ | +| **JetPack 5.x** | Ubuntu 20.04 | ROS1 Noetic ✅ ROS2 Foxy ✅ | Best choice for projects needing ROS1/ROS2 hybrid development, long-term support, and stable ecosystem. | +| **JetPack 6.x** | Ubuntu 22.04 | ROS2 Humble ✅ | Ideal for new projects, long-term maintenance, and leveraging Jetson Orin / newer hardware features. ROS1 not officially available (use Docker or build from source if needed). | + +![img](images/1-100001302_recomputer_robotics_j3011_with_gmsl_extension.jpg) + +## Install ROS1 + +**ROS1 (Robot Operating System 1)** is an open-source framework for building robot software. It uses a central master (`roscore`) and a node-based architecture, where nodes communicate via topics, services, and actions. ROS1 provides tools for visualization, simulation, and data logging, and is widely used in robotics research and prototyping. + +**Step 1.** Set up your computer to accept software from packages.ros.org. + +```bash +sudo sh -c '. 
/etc/lsb-release && echo "deb http://mirrors.ustc.edu.cn/ros/ubuntu/ `lsb_release -cs` main" > /etc/apt/sources.list.d/ros-latest.list' +``` + +Note: The command above uses a mirror. Set up your sources.list file according to the region where you are located, following this [article](https://wiki.ros.org/ROS/Installation/UbuntuMirrors#China). + +**Step 2.** Set up your keys. + +```bash +sudo apt install curl gnupg2 lsb-release -y # if you haven't already installed curl + +curl -s https://raw.githubusercontent.com/ros/rosdistro/master/ros.asc | sudo apt-key add - +``` + +![image-20250918155614312](images/image-20250918155614312.png) + +**Step 3.** Installation. + +```bash +sudo apt update +sudo apt install ros-noetic-desktop-full +sudo apt-get install python3-rosdep +sudo rosdep init +rosdep update +``` + +**Step 4.** Set Up ROS Environment. + +```bash +echo "source /opt/ros/noetic/setup.bash" >> ~/.bashrc +source ~/.bashrc +``` + +**Step 5.** Install Dependency Tools. + +```bash +sudo apt install python3-rosinstall python3-rosinstall-generator python3-wstool build-essential +``` + +**Step 6.** Test the Installation. + +```bash +roscore +``` + +![img](images/fig2.png) + +For more details, refer to the [ROS Noetic installation guide](https://wiki.ros.org/noetic/Installation/Ubuntu). + +## Install ROS2 + +**ROS2 (Robot Operating System 2)** is the second generation of the open-source robotics framework, designed to overcome the limitations of ROS1. It is built on top of DDS (Data Distribution Service) for reliable, real-time, and secure communication. ROS2 supports multi-robot systems, works on multiple platforms (Linux, Windows, macOS), and offers improved performance, security, and scalability compared to ROS1. + +**Step 1.** Set locale. 
+ +```bash +locale # check for UTF-8 +``` + +![image-20250918161452733](images/image-20250918161452733.png) + +```bash +sudo apt update && sudo apt install locales +sudo locale-gen en_US en_US.UTF-8 +sudo update-locale LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 +export LANG=en_US.UTF-8 +locale # verify settings +``` + +**Step 2.** You will need to add the ROS 2 apt repository to your system. + +```bash +sudo apt install software-properties-common +sudo add-apt-repository universe +``` + +**Step 3.** Add the official ROS2 apt repository to your Ubuntu system. + +```bash +sudo apt update && sudo apt install curl -y +export ROS_APT_SOURCE_VERSION=$(curl -s https://api.github.com/repos/ros-infrastructure/ros-apt-source/releases/latest | grep -F "tag_name" | awk -F\" '{print $4}') +curl -L -o /tmp/ros2-apt-source.deb "https://github.com/ros-infrastructure/ros-apt-source/releases/download/${ROS_APT_SOURCE_VERSION}/ros2-apt-source_${ROS_APT_SOURCE_VERSION}.$(. /etc/os-release && echo $VERSION_CODENAME)_all.deb" # If using Ubuntu derivates use $UBUNTU_CODENAME +sudo dpkg -i /tmp/ros2-apt-source.deb +``` + +**Step 4.** Install ROS 2 packages. + +```bash +sudo apt update +sudo apt install ros-humble-desktop + +``` + +**Step 5.** Install Additional Build Tools. + +```bash +sudo apt install ros-dev-tools +``` + +**Step 6.** Initialize ROS Environment. + +```bash +sudo rosdep init +rosdep update +``` + +**Step 7.** Set Up ROS Environment Variables. + +```bash +echo "source /opt/ros/humble/setup.bash" >> ~/.bashrc +source ~/.bashrc +``` + +**Step 8.** Try some examples to verify the installation. 
+ +In one terminal, source the setup file and then run a C++ `talker`: + +```bash +source /opt/ros/humble/setup.bash +ros2 run demo_nodes_cpp talker +``` + +In another terminal, source the setup file and then run a Python `listener`: + +```bash +source /opt/ros/humble/setup.bash +ros2 run demo_nodes_py listener +``` + +![image-20250918164335833](images/image-20250918164335833.png) + +![image-20250918164343330](images/image-20250918164343330.png) + +Congratulations! 🎉 You have successfully installed ROS2 and can now start developing. + +For more details, refer to the [ROS 2 Humble installation guide](https://docs.ros.org/en/humble/Installation/Ubuntu-Install-Debs.html). + + + +## Resource + +[ROS1 Tutorials](https://wiki.ros.org/ROS/Tutorials) + +[ROS2 Tutorials](https://docs.ros.org/en/humble/Tutorials.html) + +**Related demo:** + +[A-LOAM 3D SLAM](https://wiki.seeedstudio.com/a_loam/) + +[Isaac ROS Visual SLAM](https://wiki.seeedstudio.com/isaac_ros_visual_slam/) + +[Control PX4 with reComputer](https://wiki.seeedstudio.com/control_px4_with_recomputer_jetson/) \ No newline at end of file diff --git a/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/1-100001302_recomputer_robotics_j3011_with_gmsl_extension.jpg b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/1-100001302_recomputer_robotics_j3011_with_gmsl_extension.jpg new file mode 100644 index 0000000..796c79c Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/1-100001302_recomputer_robotics_j3011_with_gmsl_extension.jpg differ diff --git a/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/fig2.png b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/fig2.png new file mode 100644 index 0000000..6beda66 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/fig2.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918154012205.png b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918154012205.png new file mode 
100644 index 0000000..c506ac4 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918154012205.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918155614312.png b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918155614312.png new file mode 100644 index 0000000..dd391bd Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918155614312.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918161452733.png b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918161452733.png new file mode 100644 index 0000000..ada61b5 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918161452733.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918164335833.png b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918164335833.png new file mode 100644 index 0000000..b97e56c Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918164335833.png differ diff --git a/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918164343330.png b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918164343330.png new file mode 100644 index 0000000..2897bd1 Binary files /dev/null and b/3-Basic-Tools-and-Getting-Started/3.9-ROS/images/image-20250918164343330.png differ diff --git a/3-Basic-Tools-and-Getting-Started/README.MD b/3-Basic-Tools-and-Getting-Started/README.MD new file mode 100644 index 0000000..282f604 --- /dev/null +++ b/3-Basic-Tools-and-Getting-Started/README.MD @@ -0,0 +1,14 @@ +## 📚 Table of Basic Tools and Getting Started +| **Chapter** | **Content** | +| :---------: | :----------------------------------------------------------: | +| Module 3.1 | **[Python and Programming Fundamentals](./3.1-Python-and-Programming-Fundamentals/README.md)** | +| Module 3.2 | **[AI and 
ML](./3.2-AI-and-ML/README.md)** | +| Module 3.3 | **[CUDA](./3.4-CUDA/README.md)** | +| Module 3.4 | **[TensorRT](./3.5-TensorRT/README.md)** | +| Module 3.5 | **[PyTorch](./3.5-Pytorch/README.md)** | +| Module 3.6 | **[TensorFlow](./3.6-Tensorflow/README.md)** | +| Module 3.7 | **[Docker](./3.6-Docker/README.md)** | +| Module 3.8 | **[OpenCV-with-CUDA](./3.8-OpenCV-with-CUDA/README.md)** | +| Module 3.9 | **[ROS](./3.9-ROS/README.md)** | +| Module 3.10 | **[NoMachine](./3.10-Nomachine/README.md)** | + diff --git a/4-Computer-Vision/4.1-Overview-of-Computer-Vision/README.md b/4-Computer-Vision/4.1-Overview-of-Computer-Vision/README.md new file mode 100644 index 0000000..e7fb08e --- /dev/null +++ b/4-Computer-Vision/4.1-Overview-of-Computer-Vision/README.md @@ -0,0 +1,77 @@ +# Overview of Computer Vision + +Computer Vision is a field of Artificial Intelligence (AI) that enables computers to "see." Just as humans use their eyes to perceive the world and then use their brains to understand what they see, computer vision allows computers to "see" images through cameras and then "understand" the content of those images through programming. + +

+ computer-vision +

+ +Computer vision endows machines with human-like "visual capabilities," bringing significant transformation to many fields. It not only improves work efficiency but also helps solve complex problems, such as enhancing the accuracy of medical diagnoses, improving traffic safety, and playing a crucial role in the future of robotics. + + +## How Computer Vision Works + +The working principles of computer vision are quite similar to the way the human brain works. It can be divided into three simple steps: `seeing`, `thinking`, and `acting`. + +- **Seeing (Acquiring Images):** First, the computer needs to acquire images. Just like we use cameras to take photos, computers use cameras to capture pictures or videos. These images represent the world as seen through the computer's "eyes." + +- **Thinking (Analyzing and Understanding):** Next, the computer needs to "think," meaning it needs to understand the content of the images. This step mainly relies on computer programs and algorithms. The program analyzes the pixels in the image and identifies shapes, colors, objects, and other information. For example, when seeing a photo of a dog, the program analyzes the features of the image and determines that it is a dog, not a cat or something else. + +- **Acting (Taking Action):** Finally, the computer takes action based on the information it has understood. This action could be as simple as displaying the result, like telling you what is in the image; or it could be more complex, such as in self-driving cars, where detecting an obstacle ahead may trigger a warning or automatic braking. + +With the rapid advancement of technology, artificial neural networks have become the preferred tools for analyzing and understanding images. More and more people are inclined to use this advanced technology because it can simulate the way the human brain works, automatically learning and extracting useful information from massive amounts of data. 
This makes machines more intelligent and efficient in handling visual tasks. + +## The History of Computer Vision Development + +The history of computer vision can be divided into several key phases: + +**Initial Stage (1960s-1980s):** Early research in computer vision focused on image processing and pattern recognition, primarily for simple image analysis tasks. + +**Foundational Research Stage (1980s-1990s):** With advances in computer hardware and the emergence of machine learning methods, computer vision expanded to more complex tasks such as face recognition and object detection. + +**Algorithm Innovation Stage (2000s):** The development of feature extraction methods and algorithms like Support Vector Machines (SVM) led to significant progress in specific areas, such as handwritten digit recognition. + +**Deep Learning Stage (2010s-Present):** The application of deep learning, particularly Convolutional Neural Networks (CNNs), significantly improved performance in tasks like image classification, object detection, and image segmentation. This stage saw computer vision technologies being widely applied in areas like autonomous driving and medical imaging. + +**Multimodal Fusion and Expanding Applications (2020s-Future):** Computer vision is increasingly integrating with other technologies, such as natural language processing, fostering new applications like visual question answering and multimodal analysis. + +These stages highlight the evolution of computer vision from simple image processing tools to sophisticated visual understanding systems, continually driving advancements in the field of artificial intelligence. + +## What are the tasks in computer vision? + +

+ computer-vision +

+ +Computer vision encompasses a wide range of tasks, from basic image processing to complex scene understanding. Here are some of the main tasks in computer vision: + +1. Image Classification: The task involves assigning an image to one or more predefined categories. For example, identifying whether an image contains a cat, dog, or other objects. +2. Object Detection: Not only identifies the categories of objects in an image but also determines their locations. The output typically includes bounding boxes and associated labels. For example, detecting and labeling all vehicles, pedestrians, etc., in an image. +3. Image Segmentation: This task involves dividing an image into different regions, usually to separate the foreground from the background. Semantic segmentation assigns each pixel to a category, while instance segmentation further distinguishes between different instances of the same category. +4. Face Recognition: Involves detecting and identifying faces in images or videos, commonly used for authentication and security monitoring. +5. Pose Estimation: Detects key points of a body or object to infer its pose. For example, human pose estimation can identify joint positions to infer a person's posture. +6. Optical Flow Estimation: Computes the motion vector of each pixel in a sequence of images, useful for analyzing dynamic scenes, such as video stabilization and object tracking. +7. 3D Reconstruction: Generates 3D models from 2D images, used in virtual reality, augmented reality, and architectural modeling. +8. Scene Understanding: Involves comprehensive understanding of the entire scene, including object relationships and scene classification. For example, determining whether an image is indoor or outdoor, urban or rural. +9. Visual Question Answering: A system answers questions based on the content of images, combining natural language processing and computer vision. +10. 
Image Generation: Uses generative models, such as Generative Adversarial Networks (GANs), to create new images, including tasks like image restoration and style transfer. +11. Image Super-Resolution: Generates high-resolution images from low-resolution inputs, aiming to restore fine details as much as possible. +12. Video Analysis: Involves tasks like object detection, tracking, and behavior recognition in videos, used in surveillance, entertainment, and sports analysis. + +These tasks can often be combined, such as in autonomous vehicles, which require a combination of object detection, object tracking, and scene understanding to navigate complex environments. + +## Application Scenarios of Computer Vision +Computer vision has numerous applications in our daily lives. Here are some common examples: + +Face Recognition: On many smartphones, face recognition has become a way to unlock the device. Computer vision technology scans your face and compares it with stored facial data to confirm your identity. + +Autonomous Driving: Autonomous vehicles use computer vision to "see" road conditions. Cameras capture information about the road, pedestrians, vehicles, and traffic signs, helping the car make driving decisions. + +Medical Imaging: In hospitals, computer vision assists doctors in analyzing X-rays, CT scans, and other medical images. It helps in detecting diseases, such as identifying tumors or fractures. + +Intelligent Surveillance: In surveillance systems, computer vision can recognize faces, detect unusual behavior, or identify dangerous situations, which helps enhance public safety. + +Shopping Experience: In some modern stores, computer vision aids in cashless payments and self-checkout. After customers select items, the system automatically recognizes the products and processes the payment. + +## Conclusion +Overall, computer vision is a technology that enables computers to understand and process visual information. 
Through this technology, we allow machines to "see" and comprehend the world, performing various tasks. This not only changes our way of life but also paves the way for future technological advancements. diff --git a/4-Computer-Vision/4.1-Overview-of-Computer-Vision/images/computer-vision.png b/4-Computer-Vision/4.1-Overview-of-Computer-Vision/images/computer-vision.png new file mode 100755 index 0000000..7f1a201 Binary files /dev/null and b/4-Computer-Vision/4.1-Overview-of-Computer-Vision/images/computer-vision.png differ diff --git a/4-Computer-Vision/4.1-Overview-of-Computer-Vision/images/cv-tasks.gif b/4-Computer-Vision/4.1-Overview-of-Computer-Vision/images/cv-tasks.gif new file mode 100644 index 0000000..9f3ecf8 Binary files /dev/null and b/4-Computer-Vision/4.1-Overview-of-Computer-Vision/images/cv-tasks.gif differ diff --git a/4-Computer-Vision/4.2-Real-time-Video-Processing/README.md b/4-Computer-Vision/4.2-Real-time-Video-Processing/README.md new file mode 100644 index 0000000..bd172ea --- /dev/null +++ b/4-Computer-Vision/4.2-Real-time-Video-Processing/README.md @@ -0,0 +1,173 @@ +# Real-Time Video Processing + +Real-time video processing refers to the technology of immediately processing and analyzing video data after it is captured, providing output with minimal delay. This processing is crucial in various fields such as security monitoring, autonomous driving, smart retail, and real-time sports analysis. The key to real-time video processing lies in the ability to quickly and accurately extract useful information from the video stream and analyze and apply this information. This not only helps improve efficiency but also provides timely data support for decision-making. This article will explore how to use OpenCV and DeepStream, two powerful tools, to achieve efficient real-time video processing. 
OpenCV offers rich computer vision functionalities, while DeepStream enhances processing speed and accuracy through hardware acceleration and deep learning model optimization. + +## OpenCV + +

+ opencv-python +

+ +OpenCV is a widely used software library for computer vision and image processing. It provides a vast array of algorithms and functions for tasks such as image and video processing, object detection, and machine learning. Additionally, OpenCV offers a wealth of [tutorials and example code](https://docs.opencv.org/4.x/index.html) for beginners, helping them quickly get started and improve their skills. + +### Install OpenCV on Jetson + +OpenCV is pre-installed in the JetPack operating system. You can check its version using the following command: + +```bash +python3 -c "import cv2; print(cv2.__version__)" +``` + +If the OpenCV version number is returned correctly, it means that OpenCV is already installed on the device. + +### Reading and Displaying an Image with OpenCV + +Create a new Python script, name it `test_python.py`, and enter the following code: + +```python +import cv2 + +# read image +image_gray = cv2.imread("./jetson-examples.png", cv2.IMREAD_GRAYSCALE) +image_color = cv2.imread("./jetson-examples.png", cv2.IMREAD_COLOR) + +# show image +cv2.imshow('gray', image_gray) +cv2.imshow('color', image_color) + +# wait for a key press, then destroy the windows +cv2.waitKey(0) +cv2.destroyAllWindows() +``` +The script works as follows: +> 1. **import cv2:** Import the OpenCV package. +> 2. **cv2.imread:** Read the image, requiring the image's file path as input. +> 3. **cv2.imshow:** Create a new window on the desktop and display the image. +> 4. **cv2.waitKey(0):** Wait for a keyboard input; otherwise, the program won't proceed to the following code. +> 5. **cv2.destroyAllWindows():** Destroy all windows created by OpenCV. + +Modify the image file path and run the script: +```bash +python3 test_python.py +``` +

+ run-opencv +

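The script above handles a single still image, while real-time processing works on a continuous stream of frames. The following is a minimal sketch of such a loop, not a reference implementation. It assumes a camera is available at index 0 (or that a video file path is passed on the command line). The `FpsMeter` helper, the `run` function, and the Canny thresholds are illustrative choices and do not come from the original tutorial.

```python
import sys
import time
from collections import deque


class FpsMeter:
    """Rolling frames-per-second estimate over the last `window` frames."""

    def __init__(self, window=30):
        self.stamps = deque(maxlen=window)

    def tick(self, now=None):
        # Record one frame timestamp and return the current FPS estimate.
        self.stamps.append(time.perf_counter() if now is None else now)
        if len(self.stamps) < 2:
            return 0.0
        span = self.stamps[-1] - self.stamps[0]
        return (len(self.stamps) - 1) / span if span > 0 else 0.0


def run(source=0):
    """Read frames from `source`, run edge detection, and display the result."""
    import cv2  # imported here so FpsMeter stays usable without OpenCV

    cap = cv2.VideoCapture(source)
    if not cap.isOpened():
        raise RuntimeError(f"Cannot open video source: {source}")
    meter = FpsMeter()
    try:
        while True:
            ok, frame = cap.read()
            if not ok:  # end of file or camera error
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            edges = cv2.Canny(gray, 100, 200)  # thresholds are assumptions
            fps = meter.tick()
            cv2.putText(edges, f"{fps:.1f} FPS", (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.0, 255, 2)
            cv2.imshow("edges", edges)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()


# Runs only when a source is given, e.g. `python3 video_demo.py 0`
if __name__ == "__main__" and len(sys.argv) > 1:
    arg = sys.argv[1]
    run(int(arg) if arg.isdigit() else arg)
```

Unlike `cv2.waitKey(0)` in the still-image script, `cv2.waitKey(1)` waits only about one millisecond, so the loop keeps pulling new frames. The file name `video_demo.py` is only a placeholder.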
+ +## DeepStream +### What is NVIDIA DeepStream? + +NVIDIA's DeepStream SDK is a complete streaming analytics toolkit based on GStreamer for AI-based multi-sensor processing, video, audio, and image understanding. It's ideal for vision AI developers, software partners, startups, and OEMs building IVA apps and services. + +

+ deepstream +

+ +You can now create stream-processing pipelines that incorporate neural networks and other complex processing tasks like tracking, video encoding/decoding, and video rendering. These pipelines enable real-time analytics on video, image, and sensor data. + +### Why we need DeepStream + +NVIDIA DeepStream is a powerful SDK that lets you use GPU-accelerated technology to develop end-to-end vision AI pipelines. + +### Get Started with DeepStream SDK + +#### Installing DeepStream + +**Step 1.** Prepare the dependency environment. +```bash +pip3 install meson +pip3 install ninja + +pkg-config --modversion glib-2.0 + +sudo apt install libssl3 libssl-dev libgstreamer1.0-0 gstreamer1.0-tools gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav libgstreamer-plugins-base1.0-dev libgstrtspserver-1.0-0 libjansson4 libyaml-cpp-dev + +git clone https://github.com/confluentinc/librdkafka.git +cd librdkafka +git checkout tags/v2.2.0 +./configure --enable-ssl +make +sudo make install + +sudo mkdir -p /opt/nvidia/deepstream/deepstream/lib +sudo cp /usr/local/lib/librdkafka* /opt/nvidia/deepstream/deepstream/lib +sudo ldconfig +``` + +**Step 2.** [Download](https://catalog.ngc.nvidia.com/orgs/nvidia/resources/deepstream/files) the DeepStream Debian package `deepstream-7.0_7.0.0-1_arm64.deb` to the Jetson device. Enter the following command: + +```bash +sudo apt-get install ./deepstream-7.0_7.0.0-1_arm64.deb +``` + +**Step 3.** Because we may later write DeepStream programs in Python, we also need to install the **Python bindings**. 
+ +```bash +# Base dependencies +sudo apt install python3-gi python3-dev python3-gst-1.0 python-gi-dev git meson python3 python3-pip python3.10-dev cmake g++ build-essential libglib2.0-dev libglib2.0-dev-bin libgstreamer1.0-dev libtool m4 autoconf automake libgirepository1.0-dev libcairo2-dev +cd /opt/nvidia/deepstream/deepstream-7.0/sources +sudo git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps +# Enter the cloned repository before fetching its submodules +cd deepstream_python_apps +sudo git submodule update --init + +# Installing Gst-python +sudo apt-get install -y apt-transport-https ca-certificates +sudo update-ca-certificates +cd 3rdparty/gstreamer/subprojects/gst-python/ +sudo meson setup build +cd build +sudo ninja +sudo ninja install + +# Compiling and installing the bindings +cd /opt/nvidia/deepstream/deepstream-7.0/sources/deepstream_python_apps/bindings +sudo mkdir build +cd build +# The build directory is owned by root, so cmake also needs sudo +sudo cmake .. -DPYTHON_MAJOR_VERSION=3 -DPYTHON_MINOR_VERSION=10 -DPIP_PLATFORM=linux_aarch64 -DDS_PATH=/opt/nvidia/deepstream/deepstream/ +sudo make -j$(nproc) +pip3 install ./pyds-1.1.11-py3-none-linux_aarch64.whl +``` + +Run `pip3 list | grep pyds` in the terminal. If the terminal outputs the version information for pyds, it means the Python bindings have been successfully installed. + +

+ deepstream +

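The same check can also be done from Python, which additionally confirms that the interpreter you plan to use can locate the module. This is a small sketch that relies only on the standard library. The helper name `is_module_available` is hypothetical, and checking `gi` next to `pyds` assumes the GStreamer Python bindings are installed as the `gi` module.

```python
import importlib.util


def is_module_available(name):
    """Return True if a top-level module `name` can be found on the Python path."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "gi" (GObject introspection) and "pyds" (DeepStream bindings)
    for module in ("gi", "pyds"):
        status = "found" if is_module_available(module) else "MISSING"
        print(f"{module}: {status}")
```

If a module is reported as missing even though `pip3 list` shows it, the script is probably running under a different interpreter or virtual environment than the one `pip3` installed into.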
+ +#### Getting Started + +DeepStream ships with a wealth of reference examples. We can navigate directly to the corresponding folder and run these demos. + +```bash +cd /opt/nvidia/deepstream/deepstream-7.0/samples/configs/deepstream-app +deepstream-app -c source30_1080p_dec_infer-resnet_tiled_display_int8.txt +``` +> Note: The first run may take about 10 minutes. You can ignore any TensorRT warning messages during the process. + +> Note: If you are accessing the Jetson device remotely via SSH, run `export DISPLAY=:0` in the terminal to set the display output. + +

+ deepstream +

+ +If your project requires implementing functionality with Python and DeepStream, these reference examples are exactly what you need: + +```bash +cd /opt/nvidia/deepstream/deepstream-7.0/sources/deepstream_python_apps/apps && ls +``` + +We can quickly run these Python scripts with simple commands, such as: + +```bash +cd /opt/nvidia/deepstream/deepstream-7.0/sources/deepstream_python_apps/apps/deepstream-test1 +python3 deepstream_test_1.py +``` +> Note: If your Python script shows an error importing CUDA, you can fix this issue by using `pip3 install cuda-python`. + + +## More Reference Materials + +| **Tutorial** | **Type** | **Description** | +|:---------:|:---------:|:---------:| +| [DeepStream Official Documentation](https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_Overview.html) | doc | Welcome to the DeepStream Documentation. | +| [gstreamer-1.0](https://valadoc.org/gstreamer-1.0/index.htm) | doc | Powerful framework for creating multimedia applications. | +| [Develop and Optimize Edge AI apps with NVIDIA DeepStream](https://www.nvidia.cn/on-demand/session/gtcspring22-s41777/) | video | Learn how the latest features of DeepStream are making it easier than ever to achieve real-time performance even for complex video AI applications. 
| diff --git a/4-Computer-Vision/4.2-Real-time-Video-Processing/images/deepstream.png b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/deepstream.png new file mode 100644 index 0000000..f35b37c Binary files /dev/null and b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/deepstream.png differ diff --git a/4-Computer-Vision/4.2-Real-time-Video-Processing/images/example.png b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/example.png new file mode 100644 index 0000000..0524f92 Binary files /dev/null and b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/example.png differ diff --git a/4-Computer-Vision/4.2-Real-time-Video-Processing/images/opencv-python.jpg b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/opencv-python.jpg new file mode 100644 index 0000000..4bc6ba4 Binary files /dev/null and b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/opencv-python.jpg differ diff --git a/4-Computer-Vision/4.2-Real-time-Video-Processing/images/pyds.png b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/pyds.png new file mode 100644 index 0000000..acaf5f8 Binary files /dev/null and b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/pyds.png differ diff --git a/4-Computer-Vision/4.2-Real-time-Video-Processing/images/run-opencv.png b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/run-opencv.png new file mode 100644 index 0000000..bfe86cb Binary files /dev/null and b/4-Computer-Vision/4.2-Real-time-Video-Processing/images/run-opencv.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/README.md b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/README.md new file mode 100644 index 0000000..3bafd7c --- /dev/null +++ b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/README.md @@ -0,0 +1,349 @@ +# Train and Deploy YOLOv8 on reComputer + +In this 
document, we train and deploy an object detection model for traffic scenes on the +[reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html?queryID=f6de8f6c8d814c021e13f4455d041d03&objectID=5586&indexName=bazaar_retailer_products). +This document uses the +[YOLOv8](https://www.ultralytics.com/) +object detection algorithm as an example and provides a detailed overview of the entire process. Please note that all the operations described below take place on the Jetson edge computing device, so make sure the device is running +[JetPack 5.0](https://wiki.seeedstudio.com/NVIDIA_Jetson/) +or above. + + +
+ +
+ + + +## Dataset +The better the quality and quantity of the training data, the better the trained model. Preparing the dataset is therefore crucial. There are various ways to collect a training dataset. Here, two methods are introduced: 1. Download pre-annotated open-source public datasets. 2. Collect and annotate training data. Finally, consolidate all the data to prepare for the subsequent training phase. + +### Download public datasets + +There are many platforms where you can freely download datasets, such as +[Roboflow](https://roboflow.com/), +[Kaggle](https://www.kaggle.com/), +and more. Here, we download an annotated dataset related to traffic scenes, +[Traffic Detection Project](https://www.kaggle.com/datasets/yusufberksardoan/traffic-detection-project/download?datasetVersionNumber=1), +from Kaggle. + +The file structure after extraction is as follows: + +```sh +archive +├── data.yaml +├── README.dataset.txt +├── README.roboflow.txt +├── test +│ ├── images +│ │ ├── aguanambi-1000_png_jpg.rf.7179a0df58ad6448028bc5bc21dca41e.jpg +│ │ ├── aguanambi-1095_png_jpg.rf.4d9f0370f1c09fb2a1d1666b155911e3.jpg +│ │ ├── ... +│ └── labels +│ ├── aguanambi-1000_png_jpg.rf.7179a0df58ad6448028bc5bc21dca41e.txt +│ ├── aguanambi-1095_png_jpg.rf.4d9f0370f1c09fb2a1d1666b155911e3.txt +│ ├── ... +├── train +│ ├── images +│ │ ├── aguanambi-1000_png_jpg.rf.0ab6f274892b9b370e6441886b2d7b9d.jpg +│ │ ├── aguanambi-1000_png_jpg.rf.dc59d3c5df5d991c1475e5957ea9948c.jpg +│ │ ├── ... +│ └── labels +│ ├── aguanambi-1000_png_jpg.rf.0ab6f274892b9b370e6441886b2d7b9d.txt +│ ├── aguanambi-1000_png_jpg.rf.dc59d3c5df5d991c1475e5957ea9948c.txt +│ ├── ... +└── valid + ├── images + │ ├── aguanambi-1085_png_jpg.rf.0608a42a5c9090a4efaf9567f80fa992.jpg + │ ├── aguanambi-1105_png_jpg.rf.0aa6c5d1769ce60a33d7b51247f2a627.jpg + │ ├── ... 
+ └── labels + ├── aguanambi-1085_png_jpg.rf.0608a42a5c9090a4efaf9567f80fa992.txt + ├── aguanambi-1105_png_jpg.rf.0aa6c5d1769ce60a33d7b51247f2a627.txt + ├── ... +``` + +Each image has a corresponding text file that contains the complete annotation information for that image. The `data.yaml` file records the locations of the training, testing, and validation sets, and you need to update its paths to match your setup: + +```sh +train: ./train/images +val: ./valid/images +test: ./test/images + +nc: 5 +names: ['bicycle', 'bus', 'car', 'motorbike', 'person'] +``` + + +### Collecting and annotating data + +When public datasets cannot meet your requirements, you need to consider collecting and creating a custom dataset tailored to your specific needs. This can be achieved by collecting, annotating, and organizing relevant data. +For demonstration purposes, I captured and saved three images from +[YouTube](https://www.youtube.com/watch?v=iJZcjZD0fw0) +and used +[Label Studio](https://labelstud.io/) +to annotate the images. + +**Step 1.** Collect raw data: + +
+ +
+ +**Step 2.** Install and run the annotation tool: +```bash +sudo groupadd docker +sudo gpasswd -a ${USER} docker +sudo systemctl restart docker +sudo chmod a+rw /var/run/docker.sock + +mkdir label_studio_data +sudo chmod -R 776 label_studio_data +docker run -it -p 8080:8080 -v $(pwd)/label_studio_data:/label-studio/data heartexlabs/label-studio:latest +``` + +**Step 3.** Create a new project and complete the annotation as per the prompts: +[Label Studio Reference Documentation](https://labelstud.io/blog/quickly-create-datasets-for-training-yolo-object-detection-with-label-studio/#output-the-dataset-in-yolo-format) + +
+ +
+ +After completing the annotation, you can export the dataset in YOLO format and organize the annotated data along with the downloaded data. The simplest approach is to copy all the images to the `train/images` folder of the public dataset and the generated annotation text files to the `train/labels` folder of the public dataset. + +At this point, we have obtained the training data through two different methods and integrated them. If you want higher-quality training data, there are many additional steps to consider, such as data cleaning, class balancing, and more. Since our task is relatively simple, we will skip these steps for now and proceed with training using the data obtained above. + +## Model +In this section, we will download the YOLOv8 source code on reComputer and configure the runtime environment. + +**Step 1.** Use the following command to download the source code: + +```bash +git clone https://github.com/ultralytics/ultralytics.git +cd ultralytics +``` + +**Step 2.** Open requirements.txt and modify the relevant content: + +```bash +# Use the `vi` command to open the file +vi requirements.txt + +# Press `a` to enter edit mode, and modify the following content: +torch>=1.7.0 --> # torch>=1.7.0 +torchvision>=0.8.1 --> # torchvision>=0.8.1 + +# Press `ESC` to exit edit mode, and finally input `:wq` to save and exit the file. +``` + +**Step 3.** Run the following commands to download the required dependencies for YOLO and install YOLOv8: + +```bash +pip3 install -e . +cd .. 
+``` + +**Step 4.** Install the Jetson version of PyTorch: + +```bash +sudo apt-get install -y libopenblas-base libopenmpi-dev +wget https://developer.download.nvidia.cn/compute/redist/jp/v512/pytorch/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl -O torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl +pip3 install torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl +``` + +**Step 5.** Install the corresponding torchvision: +```bash +sudo apt install -y libjpeg-dev zlib1g-dev +git clone --branch v0.16.0 https://github.com/pytorch/vision torchvision +cd torchvision +python3 setup.py install --user +cd .. +``` + +**Step 6.** Use the following command to ensure that YOLO has been successfully installed: +```bash +yolo detect predict model=yolov8s.pt source='https://ultralytics.com/images/bus.jpg' +``` + +## Train +Model training is the process of updating model weights. By training the model, machine learning algorithms can learn from the data to recognize patterns and relationships, enabling predictions and decisions on new data. + +**Step 1.** Create a Python script for training: + +```bash +vi train.py +``` + +Press `a` to enter edit mode, and enter the following content: + +```python +from ultralytics import YOLO + +# Load the pretrained YOLOv8s weights as the starting point +model = YOLO('yolov8s.pt') + +# Train the model on the dataset described by data.yaml +results = model.train( + data='/home/nvidia/Everything_Happens_Locally/Dataset/data.yaml', + batch=8, epochs=100, imgsz=640, save_period=5 +) +``` + +Press `ESC` to exit edit mode, and finally input `:wq` to save and exit the file. +The `YOLO.train()` method has many configuration parameters; please refer to the +[documentation](https://docs.ultralytics.com/modes/train/#arguments) +for details. Additionally, you can use a more streamlined `CLI` approach to start training based on your specific requirements. + +**Step 2.** Start training with the following command: +```bash +python3 train.py +``` + +Then comes the lengthy waiting process. 
Considering the possibility of closing the remote connection window during the wait, this tutorial uses the +[Tmux](https://github.com/tmux/tmux/wiki) +terminal multiplexer. Thus, the interface I see during the process looks like this: + +
+ +
+ +Tmux is optional; any terminal works as long as the model trains normally. After the training program finishes, you can find the model weight files saved during the training process in the designated folder: + +
+ +
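To locate the saved weights from a script rather than by browsing folders, something like the following may help. It is a minimal sketch that assumes the `runs/detect/<run name>/weights/` layout seen in the paths used later in this document; the helper name `find_weights` is hypothetical.

```python
from pathlib import Path


def find_weights(runs_dir="runs/detect"):
    """Return all .pt weight files under runs_dir, newest first."""
    root = Path(runs_dir)
    if not root.is_dir():
        return []
    # Assumed layout: <runs_dir>/<run name>/weights/<file>.pt
    weights = root.glob("*/weights/*.pt")
    return sorted(weights, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in find_weights():
        print(path)
```

Run it from the directory where training was started, since the default `runs_dir` is a relative path.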
+ +## Validation +The validation process involves using a portion of the data to validate the reliability of the model. This process helps ensure that the model can perform tasks accurately and robustly in real-world applications. If you closely examine the information output during the training process, you'll notice that many validations are interspersed throughout the training. This section won't analyze the meaning of each evaluation metric but will instead analyze the model's usability by examining the prediction results. + +**Step 1.** Use the trained model to infer on a specific image: + +```bash +yolo detect predict \ + model='./runs/detect/train2/weights/best.pt' \ + source='./datas/test/images/ant_sales-2615_png_jpg.rf.0ceaf2af2a89d4080000f35af44d1b03.jpg' \ + save=True show=False +``` + +
+ +
+ +**Step 2.** Examine the inference results. + +From the detection results, it can be observed that the trained model achieves the expected detection performance. + +
+ +
+ +## Deployment +Deployment is the process of applying a trained machine learning or deep learning model to real-world scenarios. The content introduced above has validated the feasibility of the model, but it has not considered the inference efficiency of the model. In the deployment phase, it's necessary to find a balance between detection accuracy and efficiency. TensorRT inference engine can be used to improve the inference speed of the model. + +**Step 1.** To visually demonstrate the contrast between the lightweight and original models, create a new `inference.py` file using the vi tool to implement video file inference. You can replace the inference model and input video by modifying lines 8 and 9. The input in this document is a video I shot with my phone. + +```python +from ultralytics import YOLO +import os +import cv2 +import time +import datetime + + +model = YOLO("/home/nvidia/Everything_Happens_Locally/runs/detect/train2/weights/best.pt") +cap = cv2.VideoCapture('./sample_video.mp4') + +save_dir = os.path.join('runs/inference_test', datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')) +if not os.path.exists(save_dir): + os.makedirs(save_dir) +fourcc = cv2.VideoWriter_fourcc(*'mp4v') +fps = cap.get(cv2.CAP_PROP_FPS) +size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))) +output = cv2.VideoWriter(os.path.join(save_dir, 'result.mp4'), fourcc, fps, size) + +while cap.isOpened(): + success, frame = cap.read() + if success: + start_time = time.time() + results = model(frame) + annotated_frame = results[0].plot() + total_time = time.time() - start_time + fps = 1/total_time + cv2.rectangle(annotated_frame, (20, 20), (200, 60), (55, 104, 0), -1) + cv2.putText(annotated_frame, f'FPS: {round(fps, 2)}', (30, 50), 0, 0.9, (255, 255, 255), thickness=2, lineType=cv2.LINE_AA) + print(f'FPS: {fps}') + cv2.imshow("YOLOv8 Inference", annotated_frame) + output.write(annotated_frame) + if cv2.waitKey(1) & 0xFF == ord("q"): + break + 
else: + break + +cv2.destroyAllWindows() +cap.release() +output.release() +``` + +**Step 2.** Run the following command and record the inference speed before model quantization: + +```bash +python3 inference.py +``` + +
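The script above prints an instantaneous FPS value for every frame, and that value fluctuates from frame to frame. If a single figure is wanted for comparing models, one option is to average over all processed frames. The sketch below is an illustration only: the `FpsMeter` class is hypothetical, and `time.sleep` stands in for the `model(frame)` call.

```python
import time


class FpsMeter:
    """Accumulate per-frame processing times and report the average FPS."""

    def __init__(self):
        self.total_time = 0.0
        self.frames = 0

    def add(self, seconds):
        """Record the processing time of one frame, in seconds."""
        self.total_time += seconds
        self.frames += 1

    def average_fps(self):
        """Frames processed divided by total processing time (0.0 if empty)."""
        if self.total_time == 0:
            return 0.0
        return self.frames / self.total_time


if __name__ == "__main__":
    meter = FpsMeter()
    for _ in range(5):
        start = time.perf_counter()
        time.sleep(0.01)  # stand-in for `results = model(frame)`
        meter.add(time.perf_counter() - start)
    print(f"Average FPS: {meter.average_fps():.1f}")
```

In `inference.py`, `meter.add(total_time)` could be called once per frame and the average printed after the loop ends.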
+ +
+ +The result indicates that the inference speed of the model before quantization is 21.9 FPS. + +**Step 3.** Generate the quantized model: + +```bash +pip3 install onnx +yolo export model=/home/nvidia/Everything_Happens_Locally/runs/detect/train2/weights/best.pt format=engine half=True device=0 +``` + +After the program completes (about 10-20 minutes), a `.engine` file will be generated in the same directory as the input model. This file is the quantized model. + +
+ +
+ +**Step 4.** Test the inference speed using the quantized model. + +Here, you need to modify line 8 of the script created in Step 1 so that it loads the generated `.engine` file instead of the `.pt` file. + +```bash +model = YOLO("<path to best.pt>") --> model = YOLO("<path to the generated .engine file>") +``` + +Then, rerun the inference command: + +```bash +python3 inference.py +``` + +
+ +
+ +From the perspective of inference efficiency, the quantized model shows a significant improvement in inference speed. + +## Summary + +This article provides readers with a comprehensive guide that covers various aspects from data collection and model training to deployment. Importantly, all processes occur in reComputer, eliminating the need for additional GPUs from users. + + +## More Reference Materials + +| **Tutorial** | **Type** | **Description** | +|:---------:|:---------:|:---------:| +| [One-Click Deployment of YOLOv8 ](https://github.com/Seeed-Projects/jetson-examples) | project | One-Click Quick Deployment and Development of Ultralytics YOLOv8. | +| [DeepStream Official Documentation](https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_Overview.html) | doc | Welcome to the DeepStream Documentation. | +| [gstreamer-1.0](https://valadoc.org/gstreamer-1.0/index.htm) | doc | Powerful framework for creating multimedia applications. | +| [Develop and Optimize Edge AI apps with NVIDIA DeepStream](https://www.nvidia.cn/on-demand/session/gtcspring22-s41777/) | video | Learn how the latest features of DeepStream are making it easier than ever to achieve real-time performance even for complex video AI applications. 
| + + + + diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/inference_engine.png b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/inference_engine.png new file mode 100644 index 0000000..d0490da Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/inference_engine.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/inference_pt.png b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/inference_pt.png new file mode 100644 index 0000000..96c3643 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/inference_pt.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/labeling.png b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/labeling.png new file mode 100644 index 0000000..530b7f0 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/labeling.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/model_engine.png b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/model_engine.png new file mode 100644 index 0000000..7b14731 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/model_engine.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/models.png 
b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/models.png new file mode 100644 index 0000000..ea4edd4 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/models.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/raw_datas.png b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/raw_datas.png new file mode 100644 index 0000000..34c0020 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/raw_datas.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/reComputer_J4012.png b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/reComputer_J4012.png new file mode 100644 index 0000000..9a20505 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/reComputer_J4012.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/training.png b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/training.png new file mode 100644 index 0000000..825684c Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.1-Train and Deploy YOLOv8 on reComputer/images/training.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/README.md b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK 
Support/README.md new file mode 100644 index 0000000..9a386c4 --- /dev/null +++ b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/README.md @@ -0,0 +1,487 @@ + +# Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support + +This guide explains how to deploy a trained AI model into NVIDIA Jetson Platform and perform inference using TensorRT and DeepStream SDK. Here we use TensorRT to maximize the inference performance on the Jetson platform. + +
+ +## Prerequisites + +- Ubuntu Host PC (native or VM using VMware Workstation Player) +- [reComputer Jetson](https://www.seeedstudio.com/reComputer-J4012-p-5586.html) or any other NVIDIA Jetson device running JetPack 4.6 or higher + +## DeepStream Version Corresponding to JetPack Version + +For YOLOv8 to work together with DeepStream, we use the [DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo) repository, which supports different versions of DeepStream. Make sure the JetPack version on your device matches the DeepStream version you plan to use. + +
| **DeepStream Version** | **JetPack Version** |
|:---------:|:---------:|
| 6.2 | 5.1.1, 5.1 |
| 6.1.1 | 5.0.2 |
| 6.1 | 5.0.1 DP |
| 6.0.1 | 4.6.3, 4.6.2, 4.6.1 |
| 6.0 | 4.6 |
+ +To validate this guide, we installed **DeepStream SDK 6.2** on a **JetPack 5.1.1** system running on [reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html). + +## Flash JetPack to Jetson + +Now you need to make sure that the Jetson device is flashed with a [JetPack](https://developer.nvidia.com/embedded/jetpack) system including SDK components such as CUDA, TensorRT, cuDNN and more. You can use either NVIDIA SDK Manager or the command line to flash JetPack to the device. + +For flashing guides for Seeed Jetson-powered devices, please refer to the links below: +- [reComputer J1010 | J101](https://wiki.seeedstudio.com/reComputer_J1010_J101_Flash_Jetpack) +- [reComputer J2021 | J202](https://wiki.seeedstudio.com/reComputer_J2021_J202_Flash_Jetpack) +- [reComputer J1020 | A206](https://wiki.seeedstudio.com/reComputer_J1020_A206_Flash_JetPack) +- [reComputer J4012 | J401](https://wiki.seeedstudio.com/reComputer_J4012_Flash_Jetpack) +- [A203 Carrier Board](https://wiki.seeedstudio.com/reComputer_A203_Flash_System) +- [A205 Carrier Board](https://wiki.seeedstudio.com/reComputer_A205_Flash_System) +- [Jetson Xavier AGX H01 Kit](https://wiki.seeedstudio.com/Jetson_Xavier_AGX_H01_Driver_Installation) +- [Jetson AGX Orin 32GB H01 Kit](https://wiki.seeedstudio.com/Jetson_AGX_Orin_32GB_H01_Flash_Jetpack) + +## Install DeepStream + +There are multiple ways of installing DeepStream on the Jetson device. You can follow [this guide](https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_Quickstart.html) to learn more. However, we recommend installing DeepStream via the SDK Manager because it is the easiest and most reliable method. 
+ +If you install DeepStream using SDK manager, you need to execute the below commands which are additional dependencies for DeepStream, after the system boots up + +```sh +sudo apt install \ +libssl1.1 \ +libgstreamer1.0-0 \ +gstreamer1.0-tools \ +gstreamer1.0-plugins-good \ +gstreamer1.0-plugins-bad \ +gstreamer1.0-plugins-ugly \ +gstreamer1.0-libav \ +libgstreamer-plugins-base1.0-dev \ +libgstrtspserver-1.0-0 \ +libjansson4 \ +libyaml-cpp-dev +``` + +## Install Necessary Packages + +- **Step 1.** Access the terminal of Jetson device, install pip and upgrade it + +```sh +sudo apt update +sudo apt install -y python3-pip +pip3 install --upgrade pip +``` + +- **Step 2.** Clone the following repo + +```sh +git clone https://github.com/ultralytics/ultralytics.git +``` + +- **Step 3.** Open requirements.txt + +```sh +cd ultralytics +vi requirements.txt +``` + +- **Step 4.** Edit the following lines. Here you need to press `i` first to enter editing mode. Press `ESC`, then type `:wq` to save and quit + +```sh +# torch>=1.7.0 +# torchvision>=0.8.1 +``` + +**Note:** torch and torchvision are excluded for now because they will be installed later. + +- **Step 5.** Install the necessary packages + +```sh +pip3 install -r requirements.txt +``` + +If the installer complains about outdated **python-dateutil** package, upgrade it by + +```sh +pip3 install python-dateutil --upgrade +``` + +## Install PyTorch and Torchvision + +We cannot install PyTorch and Torchvision from pip because they are not compatible to run on Jetson platform which is based on **ARM aarch64 architecture**. Therefore we need to manually install pre-built PyTorch pip wheel and compile/ install Torchvision from source. + +Visit [this page](https://forums.developer.nvidia.com/t/pytorch-for-jetson) to access all the PyTorch and Torchvision links. + +Here are some of the versions supported by JetPack 5.0 and above. 
+ +**PyTorch v1.11.0** + +Supported by JetPack 5.0 (L4T R34.1.0) / JetPack 5.0.1 (L4T R34.1.1) / JetPack 5.0.2 (L4T R35.1.0) with Python 3.8 + +**file_name:** torch-1.11.0-cp38-cp38-linux_aarch64.whl +**URL:** https://nvidia.box.com/shared/static/ssf2v7pf5i245fk4i0q926hy4imzs2ph.whl + +**PyTorch v1.12.0** + +Supported by JetPack 5.0 (L4T R34.1.0) / JetPack 5.0.1 (L4T R34.1.1) / JetPack 5.0.2 (L4T R35.1.0) with Python 3.8 + +**file_name:** torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl +**URL:** https://developer.download.nvidia.com/compute/redist/jp/v50/pytorch/torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl + +- **Step 1.** Install torch according to your JetPack version in the following format, using the **URL** and **file_name** values listed above + +```sh +wget <URL> -O <file_name> +pip3 install <file_name> +``` + +For example, here we are running **JP5.0.2** and therefore we choose **PyTorch v1.12.0** + +```sh +sudo apt-get install -y libopenblas-base libopenmpi-dev +wget https://developer.download.nvidia.com/compute/redist/jp/v50/pytorch/torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl -O torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl +pip3 install torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl +``` + +- **Step 2.** Install torchvision depending on the version of PyTorch that you have installed. For example, we chose PyTorch v1.12.0, which means we need to choose Torchvision v0.13.0 + +```sh +sudo apt install -y libjpeg-dev zlib1g-dev +git clone --branch v0.13.0 https://github.com/pytorch/vision torchvision +cd torchvision +python3 setup.py install --user +``` + +Here is a list of the torchvision versions that correspond to each PyTorch version: + +- PyTorch v1.11 - torchvision v0.12.0 +- PyTorch v1.12 - torchvision v0.13.0 + +If you want a more detailed list, please check [this link](https://github.com/pytorch/vision/blob/main/README.rst). 
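The pairing above can be looked up in code when scripting the installation. This is a minimal sketch: the table holds only the two pairs listed in this section, and the function name `torchvision_tag` is hypothetical. Other releases would need to be added from the torchvision compatibility list.

```python
# Pairs taken from the list above; extend the table for other releases.
TORCHVISION_FOR_TORCH = {
    "1.11": "v0.12.0",
    "1.12": "v0.13.0",
}


def torchvision_tag(torch_version):
    """Map a torch version string to the matching torchvision git tag, or None."""
    # Keep only "major.minor", e.g. "1.12.0a0+2c916ef.nv22.3" becomes "1.12"
    major_minor = ".".join(torch_version.split(".")[:2])
    return TORCHVISION_FOR_TORCH.get(major_minor)


if __name__ == "__main__":
    # Version string of the wheel installed in Step 1
    print(torchvision_tag("1.12.0a0+2c916ef.nv22.3"))  # prints v0.13.0
```

The returned tag is the value to pass to `git clone --branch` in Step 2. Once PyTorch is installed, `torch.__version__` can be passed in instead of a hard-coded string.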
+ +## DeepStream Configuration for YOLOv8 + +- **Step 1.** Clone the following repo + +```sh +cd ~ +git clone https://github.com/marcoslucianops/DeepStream-Yolo +``` + +- **Step 2.** Check out the following commit of the repo + +```sh +cd DeepStream-Yolo +git checkout 68f762d5bdeae7ac3458529bfe6fed72714336ca +``` + +- **Step 3.** Copy **gen_wts_yoloV8.py** from **DeepStream-Yolo/utils** into **ultralytics** directory + +```sh +cp utils/gen_wts_yoloV8.py ~/ultralytics +``` + +- **Step 4.** Inside the ultralytics repo, download **pt file** from [YOLOv8 releases](https://github.com/ultralytics/assets/releases/) (example for YOLOv8s) + +```sh +cd ~/ultralytics +wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt +``` + +**NOTE:** You can use your custom model, but it is important to keep the YOLO model reference **(yolov8_)** in your **cfg** and **weights/wts** filenames to generate the engine correctly. + +- **Step 5.** Generate the cfg, wts and labels.txt (if available) files (example for YOLOv8s) + +```sh +python3 gen_wts_yoloV8.py -w yolov8s.pt +``` + +**Note:** To change the inference size (default: 640) + +```sh +-s SIZE +--size SIZE +-s HEIGHT WIDTH +--size HEIGHT WIDTH + +Example for 1280: + +-s 1280 +or +-s 1280 1280 +``` + +- **Step 6.** Copy the generated **cfg**, **wts** and **labels.txt** (if generated) files into the **DeepStream-Yolo** folder + +```sh +cp yolov8s.cfg ~/DeepStream-Yolo +cp yolov8s.wts ~/DeepStream-Yolo +cp labels.txt ~/DeepStream-Yolo +``` + +- **Step 7.** Open the **DeepStream-Yolo** folder and compile the library + +```sh +cd ~/DeepStream-Yolo +CUDA_VER=11.4 make -C nvdsinfer_custom_impl_Yolo # for DeepStream 6.2/ 6.1.1 / 6.1 +CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo # for DeepStream 6.0.1 / 6.0 +``` + +- **Step 8.** Edit the **config_infer_primary_yoloV8.txt** file according to your model (example for YOLOv8s with 80 classes) + +```sh +[property] +... +custom-network-config=yolov8s.cfg +model-file=yolov8s.wts +... 
+num-detected-classes=80 +... +``` + +- **Step 9.** Edit the **deepstream_app_config.txt** file + +```sh +... +[primary-gie] +... +config-file=config_infer_primary_yoloV8.txt +``` + +- **Step 10.** Change the video source in **deepstream_app_config.txt** file. Here a default video file is loaded as you can see below + +```sh +... +[source0] +... +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +``` + +## Run the Inference + +```sh +deepstream-app -c deepstream_app_config.txt +``` + +
+ +The above result is running on Jetson AGX Orin 32GB H01 Kit with FP32 and YOLOv8s 640x640. We can see that the FPS is around 60 and that is not the true FPS because when we set **type=2** under **[sink0]** in **deepstream_app_config.txt** file, the FPS is limited to the fps of the monitor and the monitor we used for this testing is a 60Hz monitor. However, if you change this value to **type=1**, you will be able to obtain the maximum FPS, but there will be no live detection output. + +For the same video source and the same model as used above, after changing **type=1** under **[sink0]**, the below result can be obtained. + +
+ +As you can see, the FPS is now about 139, which reflects the true inference throughput. + +## INT8 Calibration + +If you want to use INT8 precision for inference, you need to follow the steps below + +- **Step 1.** Install OpenCV + +```sh +sudo apt-get install libopencv-dev +``` + +- **Step 2.** Compile/recompile the **nvdsinfer_custom_impl_Yolo** library with OpenCV support + +```sh +cd ~/DeepStream-Yolo +CUDA_VER=11.4 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo # for DeepStream 6.2/ 6.1.1 / 6.1 +CUDA_VER=10.2 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo # for DeepStream 6.0.1 / 6.0 +``` + +- **Step 3.** For COCO dataset, download the [val2017](https://drive.google.com/file/d/1gbvfn7mcsGDRZ_luJwtITL-ru2kK99aK/view?usp=sharing), extract, and move to **DeepStream-Yolo** folder + +- **Step 4.** Make a new directory for calibration images + +```sh +mkdir calibration +``` + +- **Step 5.** Run the following to select 1000 random images from COCO dataset to run calibration + +```sh +for jpg in $(ls -1 val2017/*.jpg | sort -R | head -1000); do \ + cp ${jpg} calibration/; \ +done +``` + +**Note:** NVIDIA recommends at least 500 images to get good accuracy. In this example, 1000 images are chosen to get better accuracy (more images = more accuracy). You can change the number of images via `head -1000`; for example, use `head -2000` for 2000 images. Higher INT8_CALIB_BATCH_SIZE values will result in more accuracy and faster calibration speed, so set it according to your GPU memory. The calibration process can take a long time. + +- **Step 6.** Create the **calibration.txt** file with all selected images + +```sh +realpath calibration/*jpg > calibration.txt +``` + +- **Step 7.** Set environment variables + +```sh +export INT8_CALIB_IMG_PATH=calibration.txt +export INT8_CALIB_BATCH_SIZE=1 +``` + +- **Step 8.** Update the **config_infer_primary_yoloV8.txt** file + +From + +```sh +... +model-engine-file=model_b1_gpu0_fp32.engine +#int8-calib-file=calib.table +... +network-mode=0 +... +``` + +To + +```sh +... 
+model-engine-file=model_b1_gpu0_int8.engine +int8-calib-file=calib.table +... +network-mode=1 +... +``` + +- **Step 9.** Before running the inference, set **type=1** under **[sink0]** in **deepstream_app_config.txt** file as mentioned before, so that the FPS is not capped by the monitor refresh rate. + +- **Step 10.** Run the inference + +```sh +deepstream-app -c deepstream_app_config.txt +``` + +
+ +Here we get an FPS value of about 350! + +## Multistream Configuration + +NVIDIA DeepStream allows you to easily setup multiple streams on a single configuration file to build multistream video analytics applications. We will demonstrate later in this wiki on how models with high FPS performance can really help with multistream applications along with some benchmarks. + +Here we will take 9 streams as an example. We will be changing the **deepstream_app_config.txt** file. + +- **Step 1.** Inside the **[tiled-display]** section, change the rows and columns to 3 and 3 so that we can have a 3x3 grid with 9 streams + +```sh +[tiled-display] +rows=3 +columns=3 +``` + +- **Step 2.** Inside the **[source0]** section, set **num-sources=9** and add more **uri**. Here we will simply duplicate the current example video file 8 times to make up 9 streams in total. However, you can change to different video streams according to your application + +```sh +[source0] +enable=1 +type=3 +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 +num-sources=9 +``` + +Now if you run the application again with **deepstream-app -c deepstream_app_config.txt** command, you will see the following output + +
+ +## trtexec Tool + +Included in the samples directory is a command-line wrapper tool called [trtexec](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#trtexec). trtexec is a tool to use TensorRT without having to develop your own application. The trtexec tool has three main purposes: + +- Benchmarking networks on random or user-provided input data. +- Generating serialized engines from models. +- Generating a serialized timing cache from the builder. + +Here we can use the trtexec tool to quickly benchmark the models with different parameters. First, you need an ONNX model, which we can generate using Ultralytics YOLOv8. + +- **Step 1.** Build ONNX using: + +```sh +yolo mode=export model=yolov8s.pt format=onnx +``` + +- **Step 2.** Build engine file using trtexec as follows: + +```sh +cd /usr/src/tensorrt/bin +./trtexec --onnx= --saveEngine= +``` + +For example: + +```sh +./trtexec --onnx=/home/nvidia/yolov8s.onnx --saveEngine=/home/nvidia/yolov8s.engine +``` + +This will output performance results along with a generated **.engine** file. By default, it converts the ONNX model to a TensorRT-optimized file in **FP32** precision, and you can see the output as follows + +
+ +Here we can take the mean latency as 7.2ms which translates to 139FPS. This is the same performance we got in the previous DeepStream demo. + +However, if you want **INT8** precision which offers better performance, you can execute the above command as follows + +```sh +./trtexec --onnx=/home/nvidia/yolov8s.onnx --int8 --saveEngine=/home/nvidia/yolov8s.engine +``` + +
+ +Here we can take the mean latency as 3.2ms which translates to 313FPS. + +## YOLOv8 Benchmark Results + +We have done performance benchmarks for different YOLOv8 models running on [reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html), [AGX Orin 32GB H01 Kit](https://www.seeedstudio.com/AGX-Orin-32GB-H01-Kit-p-5569.html) and [reComputer J2021](https://www.seeedstudio.com/reComputer-J2021-p-5438.html) + +
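The FPS values quoted in the trtexec section come from a simple conversion of mean latency into throughput. A minimal sketch of that arithmetic, assuming one image per inference (batch size 1) and using the two latencies reported above; the function name `latency_ms_to_fps` is made up for this illustration:

```python
def latency_ms_to_fps(latency_ms: float, batch_size: int = 1) -> float:
    """Convert a mean inference latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    # 1000 ms per second divided by the time per batch, times frames per batch.
    return batch_size * 1000.0 / latency_ms


# Mean latencies reported by trtexec in this wiki (FP32 default, then --int8).
for label, latency in (("FP32", 7.2), ("INT8", 3.2)):
    print(f"{label}: {latency} ms -> {latency_ms_to_fps(latency):.1f} FPS")
```

Note that this treats latency as the only bottleneck. In a full DeepStream pipeline, decoding, pre-processing and display can lower the end-to-end FPS.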
+ +To learn about more performance benchmarks we have done using YOLOv8 models, please check [our blog](https://www.seeedstudio.com/blog/2023/03/30/yolov8-performance-benchmarks-on-nvidia-jetson-devices). + + +## More Reference Materials + +| **Tutorial** | **Type** | **Description** | +|:---------:|:---------:|:---------:| +| [One-Click Deployment of YOLOv8 ](https://github.com/Seeed-Projects/jetson-examples) | project | One-Click Quick Deployment and Development of Ultralytics YOLOv8. | +| [YOLOv8 documentation](https://docs.ultralytics.com) | doc | Welcome to the Ultralytics-yolo Documentation. | +| [TensorRT documentation](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html) | doc | Welcome to the TensorRT Documentation. | +| [DeepStream SDK documentation](https://docs.nvidia.com/metropolis/deepstream/dev-guide) | doc | Welcome to the DeepStream Documentation. | + diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/14.png b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/14.png new file mode 100644 index 0000000..33b7984 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/14.png differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/2.png b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/2.png new file mode 100644 index 0000000..88ddb3d Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/2.png differ diff --git 
a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/7.jpg b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/7.jpg new file mode 100644 index 0000000..a19ccb6 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/7.jpg differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/FP32-1.gif b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/FP32-1.gif new file mode 100644 index 0000000..f01700c Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/FP32-1.gif differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/FP32-no-screen.gif b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/FP32-no-screen.gif new file mode 100644 index 0000000..febd461 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/FP32-no-screen.gif differ diff --git a/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/car.gif b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/car.gif new file mode 100644 index 
0000000..1dfb243 Binary files /dev/null and b/4-Computer-Vision/4.3-Object Detection and Recognition/4.3.2-Deploy YOLOv8 on NVIDIA Jetson using TensorRT and DeepStream SDK Support/images/car.gif differ diff --git a/4-Computer-Vision/4.4-Project Practice-Intelligent Surveillance System/README.md b/4-Computer-Vision/4.4-Project Practice-Intelligent Surveillance System/README.md new file mode 100644 index 0000000..e3fe72f --- /dev/null +++ b/4-Computer-Vision/4.4-Project Practice-Intelligent Surveillance System/README.md @@ -0,0 +1,348 @@ +# AI NVR with reComputer + +## Introduction +With the advancement of artificial intelligence technology, traditional video surveillance systems are evolving towards greater intelligence. AI NVR (Network Video Recorder) combines artificial intelligence with video surveillance technology, enabling not only the recording of video but also real-time analysis, recognition, and processing of video content. This enhances the efficiency and accuracy of security monitoring. This article will introduce how to implement an AI NVR using the NVIDIA Jetson platform. + +This article provides a comprehensive guide on implementing an AI NVR (Network Video Recorder) using the NVIDIA Jetson platform. It covers everything from hardware setup and software installation to configuring DeepStream and VST for real-time video analysis and display on a video wall. + + +
+ +
+ +In this course, we will use [Nvidia VST](https://docs.nvidia.com/mms/text/media-service/VST_Overview.html) and other microservices from the [Jetson Platform Service](https://developer.nvidia.com/embedded/jetpack/jetson-platform-services-get-started) to quickly deploy a local AI NVR on a Jetson device. +Here, we use VST to add cameras, employ the DeepStream pedestrian detection model to detect objects, and display the detection results along with the original video stream on the VST video wall. + +### What is an AI NVR? + +An AI NVR is a device that integrates video recording and artificial intelligence analysis functions. Unlike traditional NVRs, an AI NVR can automatically identify key events in video footage, such as intrusions or missing objects, and even trigger alarms based on predefined rules. This level of intelligence relies on powerful computing capabilities and deep learning algorithms. + +### Why Choose the reServer (NVIDIA Jetson) Platform? + +NVIDIA Jetson is a high-performance, low-power embedded computing platform, making it ideal for AI and deep learning applications. The Jetson platform is equipped with NVIDIA GPUs, which accelerate the deep learning inference process and support a wide range of AI tools and frameworks, such as TensorFlow and PyTorch. + +reServer is an edge computing device based on the Nvidia Jetson platform. It features a compact design, passive cooling, 5x RJ45 GbE with PoE, 2x drive bays for 2.5" HDD/SSD, and a wealth of industrial interfaces, making it an ideal choice for edge AI IoT devices. + +## Prerequisites + +
+ +
+ + + +- Jetson Orin device(with the [jetpack 6.0](https://developer.nvidia.com/embedded/jetson-linux-r363) OS). +- IP Camera. + +In this course, we will accomplish the following tasks using the [reServer Industrial J4012](https://www.seeedstudio.com/reServer-industrial-J4012-p-5747.html), but you can also try using other Jetson devices. + +We can follow the instructions in [this wiki](https://wiki.seeedstudio.com/reServer_Industrial_Getting_Started/#flash-jetpack) to flash the latest JetPack 6.0 system onto the reServer. + +## Getting Started + +### Hardware Connection +- Connect the Jetson device to the network, mouse, keyboard, and monitor. +- Connect the IP Camera to the network. + +Of course, you can also remotely access the Jetson device via SSH over the local network. + +### Step1. Install `nvidia-jetson-services` + +Open the terminal of Jetson device and enter: + +```bash +sudo apt update +sudo apt install nvidia-jetson-services +``` +Then we can find that there are many microservices in `/opt/nvidia/jetson/services/`. + +
+ +
+ + +### Step2. Modify the ingress configuration + +In the `/opt/nvidia/jetson/services/ingress/config/` directory, create a new file named ai-nvr-nginx.conf and fill it with: + +```bash +# specify you service discovery config here + +location /emdx/ { + rewrite ^/emdx/?(.*)$ /$1 break; + proxy_set_header Host $host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + access_log /var/log/nginx/access.log timed_combined; + proxy_pass http://emdx_api; +} + +location /ws-emdx/ { + rewrite ^/ws-emdx/?(.*)$ /$1 break; + proxy_set_header Host $host; + proxy_pass http://emdx_websocket; + proxy_http_version 1.1; + proxy_set_header Upgrade $http_upgrade; + proxy_set_header Connection "upgrade"; +} + +``` + +### Step 3. Modify the NVR data storage location (optional) + +Open the file `/opt/nvidia/jetson/services/vst/config/vst_storage.json` and change the directory as needed. + +```bash +{ + "data_path": "/home/seeed/VST/storage/data/", + "video_path": "/home/seeed/VST/storage/video/", + "total_video_storage_size_MB": 10000 +} +``` + +### Step 4. Start the VST service +The VST service depends on other services, so all dependent services need to be started together. + +```bash +sudo systemctl start jetson-redis +sudo systemctl start jetson-ingress +sudo systemctl start jetson-vst +``` + +After the microservices start, the corresponding Docker containers will be created. + +
+ +
+ +Now, we can open the VST web UI in the browser. +In the local network, open the browser and enter: `http://:81/` + + +
+ +
+ + +### Step5. Download the AI NVR configuration file + +Open the browser and go to the [download page](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/jps/resources/reference-workflow-and-resources). + +`Download(Top right corner)` --> `Browser(Direct Download)` + +
+ +
+ +```bash +cd +unzip files.zip +cd files +tar -xvf ai_nvr-1.1.0.tar.gz +cd ai_nvr +``` + +### Step6. Modify the DeepStream configuration file + +We want to see the model's inference results in real time, so we need to modify DeepStream's output method. Here, we configure it to output an RTSP stream. + + +Locate this configuration file and update its contents. + +`/config/deepstream/pn26/service-maker/ds-config-0_nx16.yaml` + +
+ + ds-config-0_nx16.yaml + +```yaml +################################################################################ +# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: MIT +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the +# Software is furnished to do so, subject to the following conditions: +# +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL +# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING +# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER +# DEALINGS IN THE SOFTWARE. 
+################################################################################ + +deepstream: + nodes: + - type: nvinfer + # name of the primary inference must be 'pgie' for test app to route streams here + name: pgie + properties: + config-file-path: "/ds-config-files/pn26/config_infer_primary_RN34_PN26_960x544_dla0_orin_unprune_nx.txt" + model-engine-file: "/pn26-files/dla0_pn26_jp6_halfmem_bs4.engine" + unique-id: 1 + # be sure to rename model-engine-file whenever batch-size is changed + batch-size: 4 + - type: nvtracker + name: tracker + properties: + ll-config-file: "/ds-config-files/pn26/config_tracker_NvDCF_PNv2.6_Interval_1_PVA.yml;/ds-config-files/pn26/config_tracker_NvDCF_PNv2.6_Interval_1_PVA.yml" + ll-lib-file: "/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so" + sub-batches: "2:2" + tracker-width: 960 + tracker-height: 544 + - type: nvmsgconv + name: msgconv + properties: + payload-type: 1 + - type: nvmsgbroker + name: msgbroker + properties: + config: "/ds-config-files/pn26/cfg_redis.txt" + proto-lib: "/opt/nvidia/deepstream/deepstream/lib/libnvds_redis_proto.so" + conn-str: "localhost;6379;test" + topic: "test" + sync: false + async: false + - type: queue + name: checkpoint + - type: nvmultistreamtiler + name: tiler + properties: + width: 1280 + height: 720 + - type: nvdsosd + name: osd + - type: nvvideoconvert + name: converter + - type: tee + name: tee + - type: queue + name: queue_tracker + - type: queue + name: queue_tee + - type: queue + name: queue_tiler + - type: queue + name: queue_msgconv + - type: queue + name: queue_converter + - type: queue + name: queue_osd + - type: queue + name: queue_sink + - type: queue + name: queue_msgbroker + - type: nvvideoconvert + name: converter1 + - type: nvrtspoutsinkbin + name: sink + properties: + rtsp-port: 8555 + sync: false + - type: sample_video_probe.sample_video_probe + name: osd_counter + properties: + font-size: 15 + edges: + pgie: [queue_tracker, osd_counter] + 
queue_tracker: tracker + tracker: queue_tee + queue_tee: tee + tee: [queue_tiler, queue_msgconv] + queue_tiler: tiler + tiler: queue_converter + queue_converter: converter + converter: queue_osd + queue_osd: osd + osd: queue_sink + queue_sink: converter1 + converter1: sink + queue_msgconv: msgconv + msgconv: queue_msgbroker + queue_msgbroker: msgbroker +``` + +
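To make the `edges` section of the configuration easier to read, the sketch below walks the graph and prints every path that starts at the primary inference node `pgie`. The dictionary is copied by hand from the YAML above (single targets are written as one-element lists), and the helper `paths_from` is a hypothetical name used only for this illustration, not part of DeepStream.

```python
from typing import Dict, List

# Edges copied from the `edges:` section of ds-config-0_nx16.yaml.
EDGES: Dict[str, List[str]] = {
    "pgie": ["queue_tracker", "osd_counter"],
    "queue_tracker": ["tracker"],
    "tracker": ["queue_tee"],
    "queue_tee": ["tee"],
    "tee": ["queue_tiler", "queue_msgconv"],
    "queue_tiler": ["tiler"],
    "tiler": ["queue_converter"],
    "queue_converter": ["converter"],
    "converter": ["queue_osd"],
    "queue_osd": ["osd"],
    "osd": ["queue_sink"],
    "queue_sink": ["converter1"],
    "converter1": ["sink"],
    "queue_msgconv": ["msgconv"],
    "msgconv": ["queue_msgbroker"],
    "queue_msgbroker": ["msgbroker"],
}


def paths_from(node: str, edges: Dict[str, List[str]]) -> List[List[str]]:
    """Return every path from `node` to a node that has no outgoing edge."""
    successors = edges.get(node, [])
    if not successors:
        return [[node]]
    return [[node] + tail for nxt in successors for tail in paths_from(nxt, edges)]


for path in paths_from("pgie", EDGES):
    print(" -> ".join(path))
```

The output shows three branches: the video branch that ends at the RTSP `sink`, the metadata branch that ends at `msgbroker` (Redis), and the `osd_counter` probe attached directly to `pgie`.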
+ +Please note the model of your Jetson device. In this case, the Orin NX 16GB module is being used. If you are using a different model, please locate the corresponding configuration file and make the necessary modifications. + + +### Step7. Start the AI NVR application + +In the Jetson terminal, enter the appropriate command to start the AI NVR application. + +```bash +cd /files/ai_nvr + +# Orin AGX: +# sudo docker compose -f compose_agx.yaml up -d --force-recreate +# Orin NX16: +sudo docker compose -f compose_nx16.yaml up -d --force-recreate +# Orin NX8: +# sudo docker compose -f compose_nx8.yaml up -d --force-recreate +# Orin Nano: +# sudo docker compose -f compose_nano.yaml up -d --force-recreate +``` + +During the startup process, the application will create additional Docker containers, such as DeepStream. + +
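Since the only difference between the commands in Step 7 is the compose file name, the choice can be written down as a small lookup. This is a sketch for illustration: the file names come from the commands above, while the short keys (`agx`, `nx16`, `nx8`, `nano`) and the function `compose_command` are hypothetical names introduced here.

```python
from typing import List

# Compose file per Jetson Orin module, as listed in Step 7.
# The short keys are hypothetical labels chosen for this sketch.
COMPOSE_FILES = {
    "agx": "compose_agx.yaml",
    "nx16": "compose_nx16.yaml",
    "nx8": "compose_nx8.yaml",
    "nano": "compose_nano.yaml",
}


def compose_command(module: str) -> List[str]:
    """Build the docker compose command line for the given module key."""
    key = module.strip().lower()
    if key not in COMPOSE_FILES:
        raise ValueError(f"unknown module {module!r}; expected one of {sorted(COMPOSE_FILES)}")
    return ["sudo", "docker", "compose", "-f", COMPOSE_FILES[key],
            "up", "-d", "--force-recreate"]


# Print the command for the Orin NX 16GB module used in this article.
print(" ".join(compose_command("nx16")))
```

The function only builds the argument list. Run the printed command from the `ai_nvr` directory, as in the step above.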
+ +
+ +### Step8. Configure the local AI NVR through the web UI + +At this point, we have successfully installed and started the AI NVR application on the Jetson device. +The next step is to configure the camera through the web UI. + +In the local network, open the browser and enter: `http://:30080/vst/` + +Manually configure the IP camera and the DeepStream output video stream. + +`Sensor Management` --> `Add device manually` --> `Submit` + +
+ +
+ +Here, we need to enter a valid camera address or the RTSP stream path. The DeepStream output stream is rtsp://192.168.49.161:8555/ds-test. This depends on the DeepStream configuration file, which can be modified according to your needs. + +Once the configuration is successful, you can view all the feeds on the video wall. + +`Video Wall` --> `Select All` --> `Start` + +
+ +
+ + +## References +- https://developer.nvidia.com/embedded/jetpack/jetson-platform-services-get-started + + +## More Reference Materials + +| **Tutorial** | **Type** | **Description** | +|:---------:|:---------:|:---------:| +| [Jetson Platform Services Official Documentation](https://developer.nvidia.com/embedded/jetpack/jetson-platform-services-get-started) | doc | Welcome to the Jetson Platform Services Documentation. | diff --git a/4-Computer-Vision/README.md b/4-Computer-Vision/README.md new file mode 100644 index 0000000..b2cecaa --- /dev/null +++ b/4-Computer-Vision/README.md @@ -0,0 +1,15 @@ +

+ seeed-yolo-banner +

+ +## šŸ“š Table of Computer Vision Applications + +| **Chapter** | **Content** | +|:-----------:|:------------------------------------------------:| +| Module 4.1| [**Overview-of-Computer-Vision**](./4.1-Overview-of-Computer-Vision/README.md)| +| Module 4.2| [**Real-time-Video-Processing**](./4.2-Real-time-Video-Processing/README.md)| +| **Module 4.3**| **Object Detection and Recognition**| +| | [Train and Deploy YOLOv8](./4.3-Object%20Detection%20and%20Recognition/4.3.1-Train%20and%20Deploy%20YOLOv8%20on%20reComputer/README.md)| +| | [Deploy YOLOv8 using TensorRT and DeepStream SDK Support](./4.3-Object%20Detection%20and%20Recognition/4.3.2-Deploy%20YOLOv8%20on%20NVIDIA%20Jetson%20using%20TensorRT%20and%20DeepStream%20SDK%20Support/README.md)| +| **Module 4.4**| [**Project Practice-Intelligent Surveillance System**](./4.4-Project%20Practice-Intelligent%20Surveillance%20System/README.md)| + diff --git a/4-Computer-Vision/Seeed_YOLOv8.jpg b/4-Computer-Vision/Seeed_YOLOv8.jpg new file mode 100755 index 0000000..74aae67 Binary files /dev/null and b/4-Computer-Vision/Seeed_YOLOv8.jpg differ diff --git a/5-Generative-AI/5.1-Introduction-to-Generative-AI/README.md b/5-Generative-AI/5.1-Introduction-to-Generative-AI/README.md new file mode 100644 index 0000000..205fb43 --- /dev/null +++ b/5-Generative-AI/5.1-Introduction-to-Generative-AI/README.md @@ -0,0 +1,94 @@ +# Introduction to Generative AI + +## Introduction +Generative AI is a type of artificial intelligence technology that can create entirely new content based on input. This technology generates new data by learning the distribution of existing data. In this article, we will explore the fundamental concepts, model structures, and application areas of generative AI, and discuss how it differs from traditional discriminative models. + +

+ ai +

+ +## What is Generative AI? + +Generative AI refers to a class of artificial intelligence systems that generate new data by learning the distribution of existing data. Generative AI can create new samples similar to the original data and demonstrates strong generative capabilities in areas such as images, audio, and text. Unlike classification models, generative models focus on creating data rather than classifying or predicting it. + +## Difference Between Generative Models and Discriminative Models +Generative models and discriminative models are two different paradigms in machine learning, with distinct goals and application scenarios. + +- Generative Models learn the probability distribution of data and generate new, unseen samples. Common generative models include Generative Adversarial Networks (GAN), Variational Autoencoders (VAE), and autoregressive models like GPT. These models can generate new images, text, or audio content. +- Discriminative Models learn the decision boundaries of input data to perform classification or regression tasks. Examples of discriminative models include Support Vector Machines (SVM) and logistic regression. The goal of discriminative models is to classify input samples based on existing data, rather than generating new data. 
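The two bullet points above can be illustrated with a toy one-dimensional example. This is only a sketch under stated assumptions: the height values are made-up data, the generative side fits a single Gaussian to one class and samples from it, and the discriminative side uses the 1-D maximum-margin boundary (the midpoint between the closest points of the two classes). Real models such as GANs or SVMs are far more involved.

```python
import random
import statistics

# Hypothetical toy data: heights in cm for two classes.
heights_a = [150.0, 152.0, 155.0, 158.0, 160.0]
heights_b = [170.0, 172.0, 175.0, 178.0, 180.0]

# Generative view: model how class "a" is distributed, then sample new points.
mu_a = statistics.mean(heights_a)
sigma_a = statistics.stdev(heights_a)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
new_samples = [rng.gauss(mu_a, sigma_a) for _ in range(3)]

# Discriminative view: learn only a decision boundary between the classes.
# In 1-D, the maximum-margin boundary is the midpoint between the closest
# points of the two classes.
boundary = (max(heights_a) + min(heights_b)) / 2


def classify(height: float) -> str:
    """Assign a label using the boundary alone; no new data is generated."""
    return "a" if height < boundary else "b"


print("new class-a samples:", [round(x, 1) for x in new_samples])
print("boundary:", boundary, "| 163 ->", classify(163.0), "| 171 ->", classify(171.0))
```

The generative half can produce values that were never in the training list, while the discriminative half can only answer which side of the boundary an input falls on.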
+ +| Feature | Generative Models | Discriminative Models | +|:---------------:|:-------------------------------------------:|:----------------------------------------:| +| **Main Function** | Generate new data similar to the training data | Classification or prediction | +| **Learning Focus** | Data probability distribution | Decision boundary of the data | +| **Typical Models** | GAN, VAE, Autoregressive models (e.g., GPT) | SVM, Logistic Regression, Neural Network Classifiers | +| **Applications** | Image and text generation, data augmentation, artistic creation | Image classification, sentiment analysis, recommendation systems | + + +The advantage of generative models lies in their ability to not only understand the structure of data but also generate new samples. This gives them significant potential in tasks such as data augmentation, generating artistic works, and data imputation. + +## Main Models and Architectures of Generative AI + +Generative AI encompasses a variety of model architectures, each with its own unique mechanisms and applications. Here are some of the most common generative models: + +### Generative Adversarial Networks (GANs) + +

+ ai +

+ +Image Source: https://aws.amazon.com/what-is/gan/ + +Generative Adversarial Networks (GANs) are one of the most popular models in generative AI. The core idea of GANs is to train a generator and a discriminator through an "adversarial" mechanism. The generator's goal is to produce realistic data, while the discriminator's goal is to distinguish between real data and the fake data generated by the generator. Through this adversarial training, the generator gradually learns to produce more realistic samples. + +### Variational Autoencoders (VAE) + +VAEs are a type of probabilistic generative model. They encode input data into a probability distribution in a latent space (typically a Gaussian distribution), then sample from this distribution and use a decoder to generate new data. The advantage of VAEs is that they can generate smooth and continuous sample spaces, ensuring that the generated data exhibits continuity in the latent space. + +### Diffusion Models (Stable Diffusion) + +

+ ai +

+ +Image Source: https://sushant-kumar.com/blog/ddpm-denoising-diffusion-probabilistic-models + +Diffusion models work by progressively adding noise and learning to reverse the denoising process to transform random noise back into high-quality images, audio, or other data. Due to their multi-stage denoising mechanism, they perform exceptionally well in generative tasks. + +### Transformer-Based Models (Transformer) + +

+ ai +

+ +Image Source: https://www.linkedin.com/pulse/transformer-model-neural-network-which-uses-attention-tejas-bankar/ + +Transformer-based models are currently the most popular generative models. These models use self-attention mechanisms to capture global dependencies between words in a sequence, allowing for parallel processing. This significantly enhances training speed and the ability to handle long-range dependencies. They perform exceptionally well in fields such as natural language processing, image processing, and speech recognition. + +## Main Application Areas of Generative AI + +Generative AI has demonstrated remarkable potential across various fields. Here are some common application scenarios: + +1. **Artistic Creation** + + The application of generative AI in the arts is rapidly growing, particularly in generating images, music, and videos. For example, OpenAI's DALLĀ·E can create unique artistic images based on text descriptions, while platforms like MidJourney allow users to create art using AI tools. These models not only replicate existing styles but also generate entirely new artistic styles, providing new sources of inspiration for artists. + +2. **Healthcare** + + In the healthcare field, generative AI is used to generate medical images, enhance diagnostic data, and more. Especially in cases of rare diseases or data scarcity, generative AI can create simulated case data to help doctors better understand and handle complex cases. Additionally, generative AI is used to predict drug interactions and generate drug molecule structures. + +3. **Commercial Applications** + + In the commercial sector, generative AI is utilized for automating content creation, optimizing advertisements, and personalizing recommendations. For instance, AI can generate product images and write product descriptions for e-commerce platforms, as well as optimize advertising materials to improve conversion rates. + +4. 
Intelligent Voice Assistants + + Generative AI is widely used in intelligent voice assistants to enhance conversational experiences through natural language processing. Assistants like ChatGPT and Ollama can understand voice commands, generate responses, and help with tasks such as meeting notes, email drafting, and smart home control, significantly improving efficiency. + +## More Reference Materials: +- https://github.com/microsoft/generative-ai-for-beginners +- https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-generative-ai +- https://www.nvidia.com/en-us/glossary/generative-ai/ + + + diff --git a/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/AI.png b/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/AI.png new file mode 100644 index 0000000..e1d24f6 Binary files /dev/null and b/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/AI.png differ diff --git a/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/diffusion_model.png b/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/diffusion_model.png new file mode 100644 index 0000000..5c7d6e9 Binary files /dev/null and b/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/diffusion_model.png differ diff --git a/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/gan.jpg b/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/gan.jpg new file mode 100644 index 0000000..d30ff66 Binary files /dev/null and b/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/gan.jpg differ diff --git a/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/transformer.png b/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/transformer.png new file mode 100644 index 0000000..8443ac0 Binary files /dev/null and b/5-Generative-AI/5.1-Introduction-to-Generative-AI/images/transformer.png differ diff --git a/5-Generative-AI/5.1-Introduction-to-Generative-AI/language/README_zh-CN.md b/5-Generative-AI/5.1-Introduction-to-Generative-AI/language/README_zh-CN.md 
new file mode 100644 index 0000000..753968f --- /dev/null +++ b/5-Generative-AI/5.1-Introduction-to-Generative-AI/language/README_zh-CN.md @@ -0,0 +1,83 @@ +# Introduction to Generative AI + +## Introduction +Generative AI is an artificial intelligence technology that can create entirely new content from input; it generates new data by learning the distribution of existing data. In this article, we take an in-depth look at the basic concepts, model structures, and application areas of generative AI, and discuss how it differs from traditional discriminative models. + +

+ ai +

+ +## What is Generative AI? +Generative AI is a class of artificial intelligence systems that generate new data by learning the distribution of existing data. Generative AI can produce new samples similar to the original data and shows strong generative capabilities in areas such as images, audio, and text. Unlike classification models, generative models focus on generating data rather than classifying or predicting. + +## Difference Between Generative Models and Discriminative Models +Generative models and discriminative models are two different machine learning paradigms with different goals and application scenarios. + +- Generative Models learn the probability distribution of the data and generate new, unseen samples. Common generative models include Generative Adversarial Networks (GAN), Variational Autoencoders (VAE), and autoregressive models (such as GPT). They can generate new content such as images, text, or audio. +- Discriminative Models learn the decision boundary of the input data to perform classification or regression tasks. For example, Support Vector Machines (SVM) and logistic regression are discriminative models. Their task is to classify input samples based on existing data, not to generate new data. + +| Feature | Generative Models | Discriminative Models | +|--------------|-----------------------------------|-------------------------------------| +| **Main Function** | Generate new data similar to the training data | Classification or prediction | +| **Learning Focus** | Probability distribution of the data | Decision boundary of the data | +| **Typical Models** | GAN, VAE, autoregressive models (e.g., GPT) | SVM, logistic regression, neural network classifiers | +| **Applications** | Image and text generation, data augmentation, artistic creation, etc. | Image classification, sentiment analysis, recommendation systems, etc. | + +The advantage of generative models is that they can not only understand the structure of the data but also generate new samples, which gives them great potential in tasks such as data augmentation, generating artistic works, and data imputation. + +## Main Models and Architectures of Generative AI +Generative AI has many different model architectures, each with its own working mechanism and application scenarios. Here are some of the most common generative models: +### Generative Adversarial Networks (GAN) + +

+ ai +

+ +å›¾ē‰‡ę„ęŗļ¼šhttps://aws.amazon.com/what-is/gan/ + +ē”ŸęˆåÆ¹ęŠ—ē½‘ē»œę˜Æē”Ÿęˆå¼AIäø­ęœ€å—ę¬¢čæŽēš„ęØ”åž‹ä¹‹äø€ć€‚GANēš„ę øåæƒę€ęƒ³ę˜Æé€ščæ‡ā€œåÆ¹ęŠ—ā€ęœŗåˆ¶ę„č®­ē»ƒē”Ÿęˆå™Øå’Œåˆ¤åˆ«å™Øć€‚ē”Ÿęˆå™Øēš„ē›®ę ‡ę˜Æē”Ÿęˆé€¼ēœŸēš„ę•°ę®ļ¼Œč€Œåˆ¤åˆ«å™Øēš„ē›®ę ‡ę˜ÆåŒŗåˆ†ēœŸå®žę•°ę®äøŽē”Ÿęˆå™Øē”Ÿęˆēš„å‡ę•°ę®ć€‚é€ščæ‡čæ™ē§åÆ¹ęŠ—č®­ē»ƒļ¼Œē”Ÿęˆå™Øé€ęøå­¦ä¼šē”Ÿęˆę›“åŠ é€¼ēœŸēš„ę ·ęœ¬ć€‚ + +### å˜åˆ†č‡Ŗē¼–ē å™Øļ¼ˆVAE) + +VAEę˜Æäø€ē§åŸŗäŗŽę¦‚ēŽ‡ēš„ē”ŸęˆęØ”åž‹ć€‚å®ƒé€ščæ‡å°†č¾“å…„ę•°ę®ē¼–ē ęˆäø€äøŖę½œåœØē©ŗé—“äø­ēš„ę¦‚ēŽ‡åˆ†åøƒļ¼ˆé€šåøøę˜Æé«˜ę–Æåˆ†åøƒļ¼‰ļ¼Œē„¶åŽä»ŽčÆ„åˆ†åøƒäø­é‡‡ę ·ļ¼Œå¹¶é€ščæ‡č§£ē å™Øē”Ÿęˆę–°ę•°ę®ć€‚VAEēš„ä¼˜åŠæåœØäŗŽå®ƒåÆä»„ē”Ÿęˆå¹³ę»‘äø”čæžē»­ēš„ę ·ęœ¬ē©ŗé—“ļ¼Œä½æå¾—ē”Ÿęˆēš„ę•°ę®åœØę½œåœØē©ŗé—“äø­å…·ęœ‰čæžē»­ę€§ć€‚ + +### ę‰©ę•£ęØ”åž‹ļ¼ˆStable Dissusion) + +

+ ai +

+ +å›¾ē‰‡ę„ęŗļ¼šhttps://sushant-kumar.com/blog/ddpm-denoising-diffusion-probabilistic-models + +ę‰©ę•£ęØ”åž‹é€ščæ‡é€ę­„ę·»åŠ å™Ŗå£°å¹¶å­¦ä¹ åå‘åŽ»å™Ŗļ¼Œå°†éšęœŗå™Ŗå£°čæ˜åŽŸäøŗé«˜č“Øé‡ēš„å›¾åƒć€éŸ³é¢‘ē­‰ę•°ę®ļ¼Œå› å…¶å¤šé˜¶ę®µåŽ»å™Ŗęœŗåˆ¶ļ¼ŒåœØē”Ÿęˆä»»åŠ”äø­č”ØēŽ°å‡ŗč‰²ć€‚ + +### åŸŗäŗŽ Transformer ēš„ęØ”åž‹ļ¼ˆTransformer) + +

+ ai +

+ +å›¾ē‰‡ę„ęŗļ¼šhttps://www.linkedin.com/pulse/transformer-model-neural-network-which-uses-attention-tejas-bankar/ + +åŸŗäŗŽ Transformer ēš„ęØ”åž‹ę˜Æē›®å‰ęœ€äøŗęµč”Œēš„ē”ŸęˆęØ”åž‹ļ¼ŒčÆ„ęØ”åž‹é€ščæ‡č‡Ŗę³Øę„åŠ›ęœŗåˆ¶ę•ę‰åŗåˆ—äø­čÆäøŽčÆēš„å…Øå±€ä¾čµ–å…³ē³»ļ¼Œå®žēŽ°å¹¶č”Œå¤„ē†ļ¼Œå¤§å¹…ęå‡č®­ē»ƒé€Ÿåŗ¦å’Œå¤„ē†é•æč·ē¦»ä¾čµ–ēš„čƒ½åŠ›ć€‚å®ƒåœØč‡Ŗē„¶čÆ­čØ€å¤„ē†ć€å›¾åƒå¤„ē†å’ŒčÆ­éŸ³čÆ†åˆ«ē­‰é¢†åŸŸč”ØēŽ°ä¼˜å¼‚ć€‚ + +## ē”Ÿęˆå¼AIēš„äø»č¦åŗ”ē”Øé¢†åŸŸ + +ē”Ÿęˆå¼ AI å·²ē»åœØå¤šäøŖé¢†åŸŸå±•ēŽ°å‡ŗäŗ†éžå‡”ēš„åŗ”ē”Øę½œåŠ›ć€‚ä»„äø‹ę˜Æäø€äŗ›åøøč§ēš„åŗ”ē”Øåœŗę™Æļ¼š + +1. č‰ŗęœÆåˆ›ä½œ +ē”Ÿęˆå¼AIåœØč‰ŗęœÆé¢†åŸŸēš„åŗ”ē”Øę­£åœØčæ…é€Ÿå¢žé•æļ¼Œå°¤å…¶ę˜ÆåœØå›¾åƒć€éŸ³ä¹å’Œč§†é¢‘ē”Ÿęˆę–¹é¢ć€‚ä¾‹å¦‚ļ¼ŒOpenAI ēš„ DALLĀ·E åÆä»„ę ¹ę®ę–‡ęœ¬ęčæ°ē”Ÿęˆē‹¬ē‰¹ēš„č‰ŗęœÆå›¾åƒļ¼ŒMidJourney ē­‰å¹³å°ä¹Ÿå…č®øē”Øęˆ·é€ščæ‡ AI å·„å…·čæ›č”Œåˆ›ä½œć€‚čæ™äŗ›ęØ”åž‹äøä»…åÆä»„å¤åˆ¶ēŽ°ęœ‰é£Žę ¼ļ¼Œčæ˜åÆä»„åˆ›é€ å‡ŗå…Øę–°ēš„č‰ŗęœÆé£Žę ¼ļ¼Œäøŗč‰ŗęœÆå®¶ä»¬ęä¾›äŗ†ę–°ēš„ēµę„Ÿę„ęŗć€‚ +2. åŒ»ē–—é¢†åŸŸ +åœØåŒ»ē–—é¢†åŸŸļ¼Œē”Ÿęˆå¼AIč¢«ē”ØäŗŽē”ŸęˆåŒ»å­¦å½±åƒć€å¢žå¼ŗčÆŠę–­ę•°ę®ē­‰ć€‚å°¤å…¶ę˜ÆåœØē½•č§ē—…ē—‡ęˆ–ę•°ę®äøč¶³ēš„ęƒ…å†µäø‹ļ¼Œē”Ÿęˆå¼AIåÆä»„ē”ŸęˆęØ”ę‹Ÿē—…ä¾‹ę•°ę®ļ¼Œåø®åŠ©åŒ»ē”Ÿę›“å„½åœ°ē†č§£å’Œå¤„ē†å¤ę‚ēš„ē—…ä¾‹ć€‚ę­¤å¤–ļ¼Œē”Ÿęˆå¼ AI ä¹Ÿč¢«ē”Øę„é¢„ęµ‹čÆē‰©ē›øäŗ’ä½œē”Øļ¼Œē”ŸęˆčÆē‰©åˆ†å­ē»“ęž„ē­‰ć€‚ +3. å•†äøšåŗ”ē”Ø +åœØå•†äøšé¢†åŸŸļ¼Œē”Ÿęˆå¼AIē”ØäŗŽč‡ŖåŠØåŒ–å†…å®¹ē”Ÿęˆć€å¹æå‘Šä¼˜åŒ–å’ŒäøŖę€§åŒ–ęŽØčē­‰ä»»åŠ”ć€‚ęÆ”å¦‚ļ¼ŒAIåÆä»„äøŗē”µå•†å¹³å°ē”Ÿęˆäŗ§å“å›¾ē‰‡ć€å†™ä½œäŗ§å“ęčæ°ļ¼Œē”šč‡³ä¼˜åŒ–å¹æå‘Šē“ ęä»„ęå‡č½¬åŒ–ēŽ‡ć€‚ +4. 
ę™ŗčƒ½čÆ­éŸ³åŠ©ę‰‹ +ē”Ÿęˆå¼AIå¹æę³›åŗ”ē”ØäŗŽę™ŗčƒ½čÆ­éŸ³åŠ©ę‰‹ļ¼Œé€ščæ‡č‡Ŗē„¶čÆ­čØ€å¤„ē†ęå‡åÆ¹čÆä½“éŖŒć€‚åƒChatGPT态Ollamaē­‰åŠ©ę‰‹čƒ½ē†č§£čÆ­éŸ³ęŒ‡ä»¤ć€ē”Ÿęˆå›žē­”ļ¼Œåø®åŠ©å®Œęˆä»»åŠ”ļ¼Œå¦‚ä¼šč®®č®°å½•ć€é‚®ä»¶ę’°å†™å’Œę™ŗčƒ½å®¶å±…ęŽ§åˆ¶ē­‰ļ¼Œå¤§å¤§ęå‡äŗ†ę•ˆēŽ‡ć€‚ + +ę›“å¤šå‚č€ƒå†…å®¹: +- https://github.com/microsoft/generative-ai-for-beginners +- https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-generative-ai +- https://www.nvidia.com/en-us/glossary/generative-ai/ + + + diff --git a/5-Generative-AI/5.2-Generative-Adversarial-Network/README.md b/5-Generative-AI/5.2-Generative-Adversarial-Network/README.md new file mode 100644 index 0000000..11723d8 --- /dev/null +++ b/5-Generative-AI/5.2-Generative-Adversarial-Network/README.md @@ -0,0 +1,3 @@ +# Generative Adversarial Networks + +Generative Adversarial Networks (GANs) are a type of deep learning model composed of two parts: a generator and a discriminator. The generator is responsible for producing data that appears realistic, attempting to "fool" the discriminator, while the discriminator is tasked with distinguishing between the generated data and real data. The two networks compete with each other and continually improve: the generator aims to create more realistic data, while the discriminator works to enhance its ability to identify generated data. Ultimately, the generator can produce results that are very similar to real data, and GANs have wide applications in areas such as image generation and data augmentation. diff --git a/5-Generative-AI/5.3-Variational-Autoencoders/README.md b/5-Generative-AI/5.3-Variational-Autoencoders/README.md new file mode 100644 index 0000000..20ab826 --- /dev/null +++ b/5-Generative-AI/5.3-Variational-Autoencoders/README.md @@ -0,0 +1,3 @@ +# Variational Autoencoders + +Variational Autoencoders (VAE) are a type of generative model used to learn the probability distribution of complex data. 
A VAE uses an encoder to compress the input data into a probability distribution over latent variables, then samples from that distribution and passes the sample through a decoder to generate new data that resembles the input. Unlike a traditional autoencoder, a VAE is trained to minimize both the reconstruction error and the divergence between the latent distribution and a chosen prior (typically measured with the KL divergence), which lets the model generate diverse and continuous data. VAEs are widely used in applications such as image generation and anomaly detection. diff --git a/5-Generative-AI/5.4-Autoregressive-Models-and-Generative-Text-Models/README.md b/5-Generative-AI/5.4-Autoregressive-Models-and-Generative-Text-Models/README.md new file mode 100644 index 0000000..8cb5694 --- /dev/null +++ b/5-Generative-AI/5.4-Autoregressive-Models-and-Generative-Text-Models/README.md @@ -0,0 +1,3 @@ +# Autoregressive Models and Generative Text Models + +Autoregressive models are a class of generative models that generate sequential data by repeatedly predicting the next value. The output at each step depends on the previous inputs or previously generated outputs, which is why these models are common in time series and natural language processing tasks. Generative text models are a typical application of autoregressive models: they predict the next word from the words generated so far and build up entire passages step by step. Notable generative text models, such as the GPT series, use autoregressive mechanisms to generate coherent and fluent text, which is used in dialogue systems, text generation, and other tasks. 
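
The autoregressive loop described above can be illustrated with a very small sketch. This is not how GPT-style models work internally: it assumes a toy bigram model that conditions only on the single previous word, whereas a real text model conditions on the whole preceding context. The names `train_bigram` and `generate` and the sample corpus are made up for illustration.

```python
import random
from collections import defaultdict


def train_bigram(tokens):
    """Record, for each token, the tokens that follow it in the corpus."""
    followers = defaultdict(list)
    for prev, nxt in zip(tokens, tokens[1:]):
        followers[prev].append(nxt)
    return followers


def generate(followers, start, max_len=10, seed=0):
    """Autoregressive loop: each new token is sampled given the last output."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    out = [start]
    # Stop at max_len or when the last token has no known follower.
    while len(out) < max_len and followers.get(out[-1]):
        out.append(rng.choice(followers[out[-1]]))
    return out


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat and the cat ran".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "the")))
```

The important part is the `while` loop: every generated token is appended to the output and becomes the input for the next prediction, which is the same feedback pattern a large text model uses at inference time.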
diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/README.md b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/README.md new file mode 100644 index 0000000..8dc5a9e --- /dev/null +++ b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/README.md @@ -0,0 +1,167 @@ +# The Evolution of Transformer Models and Generative AI + +## Introduction + +The emergence of Transformer models represents a significant breakthrough in natural language processing (NLP) and generative tasks. By employing an innovative Attention mechanism, Transformers address the limitations of traditional RNNs (Recurrent Neural Networks) and LSTMs (Long Short-Term Memory networks) when handling long sequences. This advancement not only enhances the accuracy of language understanding but also achieves remarkable success in generative tasks. Whether it’s text translation, summarization, or dialogue systems, Transformer models and their derivative pretrained models like ChatGPT, Llama, and T5 have laid a solid foundation for generative AI. + +In this chapter, we will delve into the architecture of Transformers, the advantages of the Attention mechanism, and the applications of these models in generative tasks. We will also analyze and improve pretrained models through experimental exercises to solve specific tasks, ultimately mastering how to fine-tune models to enhance performance. + +## Principles of Transformer Architecture + +### Limitations of Traditional Sequence Models + +Before Transformers, RNNs and LSTMs were the mainstream models in NLP tasks. However, they exhibited significant limitations when processing long sequences: + +- Long-Distance Dependency Issues: RNNs and LSTMs struggle to capture long-distance dependencies as information tends to diminish over time steps, making it difficult to retain relevant context. 
+ +- Low Computational Efficiency: These models process sequences step-by-step, which prevents parallel execution, leading to lower computational efficiency. + +### Innovations of Transformer + +The Transformer model, proposed by Vaswani et al. in 2017, is entirely based on the Attention mechanism, eliminating the sequential dependency and enabling efficient performance when handling long sequences. + +
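
To make the Attention mechanism that the Transformer is built on more concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is a simplification under stated assumptions: a real Transformer first applies learned projection matrices to obtain the queries, keys, and values, while this sketch feeds the same toy vectors in for all three. The function names and the input `x` are illustrative only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs


# Three toy "word" vectors; self-attention uses them as Q, K, and V at once.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(x, x, x)
for row in out:
    print([round(c, 3) for c in row])
```

Every output row is computed from all input positions in one step, with no dependence on a previous time step, which is why this operation parallelizes well compared with an RNN.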

+ transformer architecture +

+ +Image source: https://transformers.run/c1/transformer/ + +The Transformer consists of a stack of encoders and a stack of decoders, each of which is built around two main components: + +1. Multi-Head Self-Attention mechanism: This allows the model to pay attention to all other words in the input sequence when processing a particular word, capturing long-distance dependencies. +2. Feedforward Neural Network: This is used for further processing of the output from the attention mechanism. + +This architecture enables highly parallelized processing of sequence data, greatly improving computational efficiency compared to RNN/LSTM. + +## Introduction and Advantages of the Attention Mechanism + +### The Basic Idea of the Attention Mechanism + +The Attention mechanism addresses the difficulty traditional models have in capturing global information when processing long sequences. Attention dynamically adjusts the focus of the model by calculating the relevance of each input word to all other words. Specifically, the Attention mechanism scores how similar each query (Query) is to every key (Key), turns those scores into weights, and uses the weights to combine the corresponding values (Value), thereby enhancing the model’s focus on key contextual information. + +### Self-Attention + +Self-Attention is the core of the Transformer. In the self-attention mechanism, each word in the input sequence attends not only to the words around it but also to every other word in the entire sequence. This mechanism allows the model to capture global context information in a single step, regardless of the distance between words. + +### Multi-Head Attention + +The multi-head attention mechanism allows the model to perform multiple self-attention calculations in different subspaces and concatenate the results. 
This enables the model to capture more semantic relationships across different dimensions, making the Transformer more flexible and powerful when dealing with complex tasks. + +### Advantages of the Attention Mechanism + +- **Capturing Long-Distance Dependencies:** The Attention mechanism can consider all words in the entire sequence in a single operation, effectively solving the long-distance dependency problem of traditional sequence models. +- **Parallel Computation:** The Attention mechanism does not rely on step-by-step sequence processing, which can significantly improve computational efficiency. +- **Flexibility:** The Attention mechanism can dynamically adjust the focus of the model, adapting to the needs of different tasks. + +## Applications of Large-Scale Pretrained Models + +The Transformer architecture has provided a solid foundation for the emergence of large-scale pretrained models. These models, trained on vast amounts of data and then fine-tuned for specific tasks, have significantly improved the performance of generation tasks. Moreover, Transformer models have not only achieved tremendous success in natural language processing and generation tasks but are also widely applied in other domains. Here are some application scenarios for Transformer models: + +### Dialogue Systems and Chatbots + +

+ anythingllm +

+ +- **Smart Customer Service:** Generative dialogue systems can produce natural conversations based on user input, enabling businesses to offer 24/7 customer support services. Models like ChatGPT and Llama can handle common inquiries and provide real-time feedback. +- **Virtual Assistants:** Virtual assistants such as Siri, Alexa, and Google Assistant utilize language generation technology to produce voice or text responses, helping users complete tasks or provide information. + +### Image Processing and Computer Vision + +

+ anythingllm +

+ +Image Source: https://www.jetson-ai-lab.com/vit/tutorial_sam.html + +- **Vision Transformer (ViT):** ViT is the application of Transformer in computer vision, where the image is directly divided into patches, and these patches are treated as a sequence input for the Transformer to perform image classification. Compared to traditional Convolutional Neural Networks (CNNs), ViT demonstrates stronger performance on large-scale datasets. + +- **Image Generation:** Transformers can also be used for image generation tasks, by modeling the sequence of image pixels to produce high-quality images. + +### Speech Processing + +

+ anythingllm +

+ +- **Speech Recognition:** Transformers have made significant advancements in speech recognition tasks. Models like Conformer, which combine the advantages of convolution and Transformer, perform well in real-time speech-to-text (ASR) systems. +- **Speech Generation:** Transformers are also widely used in text-to-speech (TTS) tasks. Models such as Tacotron 2, which integrate Attention mechanisms with RNN/Transformer, provide high-quality solutions for generating natural-sounding speech synthesis. + +### Time Series Forecasting + +- **Financial Market Analysis and Forecasting:** Transformers are used to process time series data in financial markets. The model can capture long-distance temporal dependencies through the Attention mechanism, leading to more accurate predictions of stock prices and market trends. +- **Energy Consumption Forecasting:** In energy management, Transformers are employed to predict future energy demands. By analyzing long-term trends in energy consumption data, they help optimize energy distribution and scheduling. + +### Robot Control and Reinforcement Learning + +

+ anythingllm +

+ +Image Source: https://www.jetson-ai-lab.com/lerobot.html + +- **Robot Path Planning:** Transformers are applied in robot control tasks for path planning and task decision-making. By modeling the sequence of the robot’s states, they optimize the effectiveness of path planning. +- **Game AI:** In reinforcement learning tasks, Transformers are used to handle complex gaming environments, helping to train more capable game AI for board games and real-time strategy games. + +### Drug Discovery + +- **Compound Generation and Optimization:** Transformer models are used to generate and optimize the structures of chemical compounds, aiding in the discovery of potential drug molecules and improving the efficiency of drug design. + + +These examples show that the flexibility and capability of Transformer models extend beyond natural language processing to a wide range of tasks and fields, driving technological advances in each domain. + + +## Experiment: Analyzing and Improving Pretrained Models for Specific Tasks + +In the following experiment, we will learn how to load a pretrained model and fine-tune it to complete specific generation tasks. Here, we demonstrate how to adapt an English chat model (Phi-1.5) to support Chinese. + +

+ llama-factory +

+ +### Step 1: Configure the Training Environment + +Choose a hardware device. Here, I am using the NVIDIA Jetson AGX Orin 64GB, but you can also use a device with less memory, such as the Jetson Orin NX 16GB. Then, use jetson-examples to install llama-factory. + +> - **NVIDIA Jetson AGX Orin 64GB:** A high-performance edge computing device +> - **jetson-examples:** A tool for quickly deploying popular projects to Jetson devices +> - **llama-factory:** A tool for easily training models + +Open the terminal on your Jetson device and execute the following: + +```bash +pip3 install jetson-examples +sudo reboot +reComputer run llama-factory +``` + +If all commands execute successfully, we can then use a browser to access the llama-factory WebUI. + +```bash +# http://:7860 +http://127.0.0.1:7860 +``` +### Step 2: Start Training + +Configure the pretrained model and the Chinese dataset in the WebUI, and then start the training. + +

+ train +

+ +### Step 3: Effectiveness Testing +Wait for the model to finish training. We can use the llama-factory tool to load the fine-tuned model and test its effectiveness. As seen in the screenshot below, the fine-tuned model has acquired the capability to generate Chinese text. + +

+ test + test +

+ + +> Note: For more detailed experimental content, please visit https://wiki.seeedstudio.com/Finetune_LLM_on_Jetson/ + +## Additional Reference Materials + +- https://github.com/hiyouga/LLaMA-Factory +- https://github.com/Seeed-Projects/jetson-examples + + + diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/ViT.png b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/ViT.png new file mode 100644 index 0000000..79c061b Binary files /dev/null and b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/ViT.png differ diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/anythingllm.png b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/anythingllm.png new file mode 100644 index 0000000..3856f51 Binary files /dev/null and b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/anythingllm.png differ diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/llama-factory.gif b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/llama-factory.gif new file mode 100644 index 0000000..ad4600d Binary files /dev/null and b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/llama-factory.gif differ diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/robot.png b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/robot.png new file mode 100644 index 0000000..83f25ad Binary files /dev/null and b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/robot.png differ diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/test1.png b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/test1.png new file mode 100644 index 
0000000..a910c8a Binary files /dev/null and b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/test1.png differ diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/test2.png b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/test2.png new file mode 100644 index 0000000..701ace3 Binary files /dev/null and b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/test2.png differ diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/train.png b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/train.png new file mode 100644 index 0000000..caa9848 Binary files /dev/null and b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/train.png differ diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/transformer.png b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/transformer.png new file mode 100644 index 0000000..c84fddb Binary files /dev/null and b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/transformer.png differ diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/video.jpeg b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/video.jpeg new file mode 100644 index 0000000..18beb00 Binary files /dev/null and b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/images/video.jpeg differ diff --git a/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/language/README_zh-CN.md b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/language/README_zh-CN.md new file mode 100644 index 0000000..5417362 --- /dev/null +++ 
b/5-Generative-AI/5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/language/README_zh-CN.md @@ -0,0 +1,160 @@ +# The Evolution of Transformer Models and Generative AI + +## Introduction +In natural language processing (NLP) and generative tasks, the emergence of the Transformer model marks a major breakthrough. Its innovative Attention mechanism overcomes the limitations of traditional RNN and LSTM models on long sequences, which not only improves the accuracy of language understanding but has also brought notable success in generative tasks. Whether for text translation, summarization, or dialogue systems, Transformer models and the pretrained models derived from them, such as ChatGPT, Llama, and T5, have laid a solid foundation for generative AI. + +In this chapter, we take a close look at the Transformer architecture, the advantages of the Attention mechanism, and the applications of these models in generative tasks. You will also analyze and improve a pretrained model in a hands-on experiment to solve a specific task, and ultimately learn how to fine-tune models to improve performance. + +## Principles of the Transformer Architecture + +### Limitations of Traditional Sequence Models +Before the Transformer, RNNs (recurrent neural networks) and LSTMs (long short-term memory networks) were the mainstream models for NLP tasks. However, they show significant limitations when processing long sequences: +- Long-distance dependency problem: when RNNs and LSTMs process long sequences, information decays as the number of time steps grows, making long-range dependencies hard to capture. +- Low computational efficiency: because these models must process a sequence step by step and cannot run in parallel, their computational efficiency is low. + +### Innovations of the Transformer + +The Transformer model was proposed by Vaswani et al. in 2017. Its design is based entirely on the Attention mechanism and does away with sequential dependency, so it stays efficient when processing long sequences. + +

+ transformer architecture +

+ +Image source: https://transformers.run/c1/transformer/ + +The Transformer is made up of multiple encoders (Encoder) and decoders (Decoder), each of which contains two main parts: + +1. Multi-Head Self-Attention: lets the model attend to all other words in the input sequence while processing a given word, capturing long-distance dependencies. +2. Feedforward neural network: further processes the output of the attention mechanism. +This architecture can process sequence data in a highly parallel way and greatly improves computational efficiency compared with RNN/LSTM. + +## Introduction and Advantages of the Attention Mechanism + +### The Basic Idea of the Attention Mechanism + +The Attention mechanism addresses the difficulty traditional models have in capturing global information when processing long sequences. Attention computes the relevance of each input word to all other words and dynamically adjusts the model's focus. Specifically, the Attention mechanism scores how similar each query (Query) is to every key (Key), turns those scores into weights, and uses the weights to combine the values (Value), which strengthens the model's focus on key contextual information. + +### Self-Attention + +Self-Attention is the core of the Transformer. In self-attention, each word in the input sequence attends not only to the words around it but can also relate to every other word in the whole sequence. This lets the model capture global context in a single step, no matter how far apart the words are. + +### Multi-Head Attention + +The multi-head attention mechanism lets the model run several self-attention computations in different subspaces and concatenate the results. This allows the model to capture more semantic relationships along different dimensions, making the Transformer more flexible and powerful on complex tasks. + +### Advantages of the Attention Mechanism + +- Capturing long-distance dependencies: the Attention mechanism can consider all words of the whole sequence in a single operation, which effectively solves the long-distance dependency problem of traditional sequence models. +- Parallel computation: the Attention mechanism no longer relies on step-by-step processing of the sequence, which can greatly improve computational efficiency. +- Flexibility: the Attention mechanism can dynamically adjust the model's focus to suit the needs of different tasks. + +## Applications of Large-Scale Pretrained Models + +The Transformer architecture provides a solid foundation for large-scale pretrained models. These models are pretrained on massive amounts of data and then fine-tuned on specific tasks, which greatly improves performance on generation tasks. Beyond their success in natural language processing and generation, Transformer models are also widely applied in other domains. Here are some application scenarios for Transformer models: + +### Dialogue Systems and Chatbots + +

+ anythingllm +

+ +- Smart customer service: generative dialogue systems can produce natural conversations from user input, helping businesses offer 24/7 customer support. Models such as ChatGPT and Llama can handle common questions and provide real-time feedback. +- Virtual assistants: virtual assistants such as Siri, Alexa, and Google Assistant use language generation technology to produce voice or text responses, helping users complete tasks or find information. + +### Image Processing and Computer Vision + +

+ anythingllm +

+ +Image source: https://www.jetson-ai-lab.com/vit/tutorial_sam.html + +- Vision Transformer (ViT): ViT applies the Transformer to computer vision. The image is split directly into small patches (Patch), and the patches are fed to the Transformer as a sequence to perform image classification. Compared with traditional convolutional neural networks (CNNs), ViT performs better on large-scale datasets. +- Image generation: Transformers can also be used for image generation, producing high-quality images by modeling the sequence of image pixels. + +### Speech Processing + +

+ anythingllm +

+ +- Speech recognition: Transformers have also made significant progress in speech recognition. Models such as Conformer combine the strengths of convolution and the Transformer and perform well in real-time speech-to-text (ASR) systems. +- Speech generation: Transformers are also widely used in text-to-speech (TTS) tasks. Models such as Tacotron 2 combine the Attention mechanism with RNN/Transformer and provide high-quality solutions for natural-sounding speech synthesis. + +### Time Series Forecasting + +- Financial market analysis and forecasting: Transformers are used to process time series data from financial markets. Through the Attention mechanism the model can capture long-range temporal dependencies and so predict stock prices, market trends, and similar quantities more accurately. +- Energy consumption forecasting: in energy management, Transformers are used to predict future energy demand. By analyzing long-term trends in energy consumption data, they help optimize energy distribution and scheduling. + +### Robot Control and Reinforcement Learning + +

+ anythingllm +

+ +Image source: https://www.jetson-ai-lab.com/lerobot.html + +- Robot path planning: in robot control tasks, Transformers are applied to path planning and task decision-making. By modeling the sequence of the robot's states, they improve the results of path planning. +- Game AI: in reinforcement learning tasks, Transformers are used to handle complex game environments and help train more capable game AI, for example in board games and real-time strategy games. + +### Drug Discovery + +- Compound generation and optimization: Transformer models are used to generate and optimize the molecular structures of chemical compounds, helping to discover potential drug molecules and improving the efficiency of drug design. + +These examples show that the flexibility and capability of Transformer models extend beyond natural language processing to many tasks and fields, driving technical progress in each of them. + +## Experiment: Analyzing and Improving Pretrained Models for Specific Tasks + +In the following experiment, we will learn how to load a pretrained model and fine-tune it to complete a specific generation task. Here we demonstrate how to make an English chat model (Phi-1.5) support Chinese. + +

+ llama-factory +

+ + +### Step 1. Configure the Training Environment + +Choose a hardware device. Here I am using the NVIDIA Jetson AGX Orin 64GB, but you can also use a device with less memory, such as the Jetson Orin NX 16GB. Then use jetson-examples to install llama-factory. + +> - NVIDIA Jetson AGX Orin 64GB: a high-performance edge computing device +> - jetson-examples: a tool for quickly deploying popular projects to Jetson devices +> - llama-factory: a tool for easily training models + +Open a terminal on the Jetson device and run: + +```bash +pip3 install jetson-examples +sudo reboot +reComputer run llama-factory +``` + +If all commands run successfully, we can use a browser to access the llama-factory WebUI. + +```bash +# http://:7860 +http://127.0.0.1:7860 +``` + +### Step 2. Start Training +Configure the pretrained model and the Chinese dataset in the WebUI, then start training. + +

+ train +

+ +### Step 3. Effectiveness Testing +Wait for the model to finish training. You can then use the llama-factory tool to load the fine-tuned model and test it. The screenshots below show that the fine-tuned model is now able to generate Chinese text. + +

+ test + test +

+ + +> For more detailed experiment content, see https://wiki.seeedstudio.com/Finetune_LLM_on_Jetson/ + +## Additional Reference Materials + +- https://github.com/hiyouga/LLaMA-Factory +- https://github.com/Seeed-Projects/jetson-examples + diff --git a/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/README.md b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/README.md new file mode 100644 index 0000000..142b599 --- /dev/null +++ b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/README.md @@ -0,0 +1,110 @@ +# Local Intelligent Q&A System + +In the field of artificial intelligence today, general large language models (such as GPT-3 and GPT-4) are widely used to build chatbots. These models possess powerful natural language processing capabilities and can handle a variety of questions, generating answers that resemble human language. However, despite their excellent performance in general scenarios, these models often lack precision in domain-specific Q&A scenarios. This is because general large language models lack a deep understanding of specific domain knowledge; their training data is vast but not targeted, leading to inaccurate, verbose, or irrelevant answers when dealing with specific scenarios. + +To address this issue, the industry has proposed a series of solutions tailored for specific scenarios to enhance the professionalism and accuracy of chatbots. This article introduces these solutions and demonstrates them with a simple example. + +## Why Deploy LLMs Locally? +Deploying large language models locally (localized deployment) is a choice based on different business needs and technical scenarios. Here are some of the main reasons: + +### Data Privacy and Security + +In scenarios with stringent requirements for data privacy and security, such as healthcare, finance, and government, localized deployment is often the only practical choice. 
Sensitive data in these fields may not be suitable for upload to the cloud for processing due to strict legal and compliance requirements. Deploying large language models locally ensures that data does not leave the local network environment, thus avoiding data leaks or external risks. + +### Reduced Network Dependency + +Cloud deployment typically relies on internet connectivity. In situations with unstable network conditions or limited bandwidth, the performance of cloud service access may suffer. Localized deployment avoids dependency on external networks, ensuring that the system operates efficiently at all times, especially in scenarios requiring real-time responses. + +### Customization and Control + +Deploying large language models locally allows developers to customize and optimize the models to better suit specific business needs. With cloud services, users can only access generic APIs, whereas local deployment enables developers to fine-tune, extend, and have greater flexibility in adjusting model behavior. + +### Cost Optimization +While the cloud offers convenient scalability and maintenance services, the long-term use of large models in the cloud can incur significant costs, particularly for businesses that require continuous model invocations. Deploying models locally can save on cloud service expenses, especially for enterprises with large-scale usage. + +## Solutions to the Limitations of General Models + +Facing the problem of insufficient performance of general models in specific scenarios, the following solutions have been widely applied and recognized: + +### Knowledge-Enhanced Models (Retrieval-Augmented Generation, RAG) + +RAG is a technical framework that combines retrieval with generation, addressing the knowledge gaps in large language models by introducing external knowledge bases. In the RAG architecture, the user’s query is first sent to a retrieval module, which extracts relevant documents or information from a pre-built knowledge base. 
These pieces of information are then combined with the generation model to produce more accurate answers. This method leverages the linguistic capabilities of the generation model while ensuring the professionalism and accuracy of the answers. This approach is particularly suitable for domain-specific Q&A scenarios, such as healthcare, law, or corporate internal knowledge Q&A. + +
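
The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how AnythingLLM or any particular RAG framework implements it: the "embedding" is a plain bag-of-words count instead of a learned embedding model, the knowledge base is three made-up sentences, and the final call to a language model is left out, so the sketch stops at building the augmented prompt. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, documents, top_k=1):
    """Retrieval step: rank documents by similarity to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Augmentation step: put the retrieved text in front of the question."""
    context = "\n".join(retrieve(query, documents, top_k))
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")


# Made-up knowledge base, for illustration only.
knowledge_base = [
    "Mira the fox lives in a lighthouse.",
    "Tobin the robot repairs bicycles.",
    "The village bakery opens at dawn.",
]
print(build_prompt("Where does Mira the fox live?", knowledge_base))
```

In a real deployment the prompt returned by `build_prompt` would be sent to the generation model, and the knowledge base would be stored in a vector database rather than a Python list.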

+ RAG +

+ +Image source: https://transformers.run/c1/transformer/ + +### Fine-Tuning Models +Fine-tuning is another common method used to optimize general models for specific scenarios. By fine-tuning pre-trained language models, the models can become more adapted to specific domain data and tasks. For example, one can further train a large language model on internal company documents, technical manuals, or industry-specific Q&A data. This approach can significantly improve the model’s performance in specific domains. + +The downside of fine-tuning is that it requires a substantial amount of domain-specific data, and the training process may demand considerable computational resources. However, once fine-tuning is complete, the model becomes more professional and precise in handling domain-specific issues. + +Additionally, there are many other solutions such as Knowledge Graphs, Rule-Based Systems, and Hybrid Systems. The appropriate method can be chosen based on the actual situation. + +## Building a Domain-Specific Q&A System Using RAG + +We will demonstrate how to build a knowledge base-driven chatbot using RAG through a simple example. For the sake of demonstration, we will use Ollama as the inference engine to load the large language model, AnythingLLM to construct the RAG knowledge base, and deploy all services to a local Jetson device. + +> - **Ollama:** Inference engine for large language models +> - **AnythingLLM:** AI application construction tool +> - **Jetson:** High-performance edge computing device + +

+ RAG +

+ +### Step 1. Install and Run Ollama + +Here, we use [`jetson-examples`](https://github.com/Seeed-Projects/jetson-examples) to quickly deploy the large language model to the Jetson device. + +```bash +sudo apt install python3-pip +pip3 install jetson-examples +reComputer run ollama +ollama run llama3 +``` + +> Please keep this terminal open. + +### Step 2. Install and Run AnythingLLM + +We can deploy AnythingLLM directly using Docker: + +```bash +docker pull mintplexlabs/anythingllm + +export STORAGE_LOCATION=$HOME/anythingllm +mkdir -p "$STORAGE_LOCATION" +touch "$STORAGE_LOCATION/.env" +docker run -d -p 3001:3001 --cap-add SYS_ADMIN \ + -v ${STORAGE_LOCATION}:/app/server/storage \ + -v ${STORAGE_LOCATION}/.env:/app/server/.env \ + -e STORAGE_DIR="/app/server/storage" \ + mintplexlabs/anythingllm +``` + +### Step 3. Configure the Local Knowledge Base + +Once AnythingLLM has started, open `http://:3001` in a browser, with the Jetson's IP address as the host before `:3001`, to access its WebUI and upload the knowledge base files there. + +Here, I used ChatGPT to generate a few [short stories](./story1.txt) and uploaded them to AnythingLLM. + +
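Before relying on the WebUI, it can help to confirm that the Ollama server started in Step 1 answers over HTTP. The sketch below builds a non-streaming request for Ollama's `/api/generate` endpoint using only the Python standard library. It assumes Ollama is reachable on its default port 11434 on the same machine, which may not hold depending on how the `jetson-examples` container maps ports; `build_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: Ollama listens on its default port 11434 on this machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3", url=OLLAMA_URL):
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, timeout=120):
    """Send the prompt to Ollama and return the generated text.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

On the Jetson, calling `ask("Reply with one short sentence.")` should return text generated by `llama3` if the server is up. A connection error usually means the server is not running or is exposed on a different host or port.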

+ upload +

+ +### Step 4. Effectiveness Testing + +

+ test +

+ + +## More content + +- https://docs.anythingllm.com/ +- https://wiki.seeedstudio.com/local_ai_ssistant/ +- https://wiki.seeedstudio.com/Local_RAG_based_on_Jetson_with_LlamaIndex/ +- https://github.com/Seeed-Projects/jetson-examples \ No newline at end of file diff --git a/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/RAG.png b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/RAG.png new file mode 100644 index 0000000..76c929e Binary files /dev/null and b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/RAG.png differ diff --git a/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/anythingllm.png b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/anythingllm.png new file mode 100644 index 0000000..3856f51 Binary files /dev/null and b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/anythingllm.png differ diff --git a/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/test.png b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/test.png new file mode 100644 index 0000000..ad9e0b2 Binary files /dev/null and b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/test.png differ diff --git a/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/upload.png b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/upload.png new file mode 100644 index 0000000..2dac3b0 Binary files /dev/null and b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/images/upload.png differ diff --git a/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/language/README_zh-CN.md b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/language/README_zh-CN.md new file mode 100644 index 0000000..729283c --- /dev/null +++ b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/language/README_zh-CN.md @@ -0,0 +1,109 @@ +# Local Intelligent Q&A System 
+In today's field of artificial intelligence, general-purpose large language models (such as GPT-3 and GPT-4) are widely used to build chatbots. These models have strong natural language processing capabilities, can handle a wide range of questions, and generate human-like answers. However, although they perform well in general scenarios, they are often not precise enough in domain-specific Q&A. This is because general-purpose large language models lack a deep understanding of domain knowledge: their training data is vast but not targeted, so their answers to scenario-specific questions may be inaccurate, verbose, or irrelevant. + +To address this problem, the industry has proposed a series of scenario-specific solutions that improve the professionalism and accuracy of chatbots. This article introduces these solutions in detail and demonstrates them with a simple example. + +## Why Deploy Locally? + +Deploying large language models locally (localized deployment) is a choice driven by particular business needs and technical scenarios. The main reasons are as follows: + +### 1. Data Privacy and Security + +In scenarios with extremely high requirements for data privacy and security, local deployment is the natural choice. For example, sensitive data in fields such as healthcare, finance, and government may not be suitable for upload to the cloud for processing because of strict legal and compliance requirements. Deploying large language models locally ensures that data does not leave the local network environment, thus avoiding data leaks or external risks. + +### 2. Reduced Network Dependency + +Cloud deployment typically relies on internet connectivity. When network conditions are unstable or bandwidth is limited, the performance of cloud service access may suffer. Local deployment avoids dependency on external networks, ensuring that the system operates efficiently at all times, especially in scenarios requiring real-time responses. + +### 3. Customization and Control + 
+Deploying large language models locally allows developers to customize and optimize the models to better suit specific business needs. With cloud services, users can only access generic APIs, whereas local deployment lets developers fine-tune and extend the model and gives them greater flexibility in adjusting its behavior. + +### 4. Cost Optimization + +While the cloud offers convenient scalability and maintenance services, long-term use of large models in the cloud can incur significant costs, particularly for business scenarios that invoke the model continuously. Deploying models locally can save on cloud service expenses, especially for enterprises with large-scale usage. + +## Solutions to the Limitations of General Models + +To address the insufficient performance of general models in specific scenarios, the following solutions have been widely applied and recognized: + +### 1. Knowledge-Enhanced Models (Retrieval-Augmented Generation, RAG) + +RAG is a technical framework that combines retrieval with generation, addressing the knowledge gaps in large language models by introducing external knowledge bases. In the RAG architecture, the user's query is first sent to a retrieval module, which extracts relevant documents or information from a pre-built knowledge base. This information is then combined with the generation model to produce more accurate answers. The method leverages the linguistic capabilities of the generation model while supporting the professionalism and accuracy of the answers. It is particularly suitable for domain-specific Q&A scenarios, such as healthcare, law, or internal corporate knowledge Q&A. + +

+ RAG +

+ +Image source: https://transformers.run/c1/transformer/ + +### 2. Fine-Tuning Models + +Fine-tuning is another common method used to optimize general models for specific scenarios. By fine-tuning a pre-trained language model, the model can be adapted to domain-specific data and tasks. For example, one can continue training a large language model on internal company documents, technical manuals, or industry-specific Q&A data. This approach can significantly improve the model's performance in specific domains. +The downside of fine-tuning is that it requires a large amount of domain data, and the training process may demand considerable computational resources. However, once fine-tuning is complete, the model becomes more specialized and precise when handling domain-specific questions. + +In addition, there are many other solutions such as knowledge graphs, rule-based systems, and hybrid systems. Readers can choose the appropriate method based on their actual situation. + +## Building a Domain-Specific Q&A System Using RAG + +We will demonstrate how to build a knowledge base-driven chatbot using RAG through a simple example. For ease of demonstration, we use Ollama as the inference engine to load the large language model, AnythingLLM to construct the RAG knowledge base, and deploy all services to a local Jetson device. + +> - **Ollama:** Inference engine for large language models +> - **AnythingLLM:** AI application construction tool +> - **Jetson:** High-performance edge computing device + +

+ RAG +

+ +### Step 1. Install and Run Ollama + +Here, we use jetson-examples to quickly deploy the large language model to the Jetson device. + +```bash +sudo apt install python3-pip +pip3 install jetson-examples +reComputer run ollama +ollama run llama3 +``` + +> Note: Please do not close this terminal window. + +### Step 2. Install and Run AnythingLLM + +We can deploy AnythingLLM directly using Docker: + +```bash +docker pull mintplexlabs/anythingllm + +export STORAGE_LOCATION=$HOME/anythingllm +mkdir -p "$STORAGE_LOCATION" +touch "$STORAGE_LOCATION/.env" +docker run -d -p 3001:3001 --cap-add SYS_ADMIN \ + -v ${STORAGE_LOCATION}:/app/server/storage \ + -v ${STORAGE_LOCATION}/.env:/app/server/.env \ + -e STORAGE_DIR="/app/server/storage" \ + mintplexlabs/anythingllm +``` + +### Step 3. Configure the Local Knowledge Base +Once AnythingLLM has started, open `http://:3001` in a browser, with the Jetson's IP address as the host before `:3001`, to access its WebUI and upload the knowledge base files there. +Here, I used ChatGPT to generate a few [short stories](../story1.txt) and uploaded them to AnythingLLM. + +

+ upload +

+ +### Step 4. Effectiveness Testing + +

+ test +

+ + +## More content + +- https://docs.anythingllm.com/ +- https://wiki.seeedstudio.com/local_ai_ssistant/ +- https://wiki.seeedstudio.com/Local_RAG_based_on_Jetson_with_LlamaIndex/ +- https://github.com/Seeed-Projects/jetson-examples diff --git a/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/story1.txt b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/story1.txt new file mode 100644 index 0000000..59cb3c8 --- /dev/null +++ b/5-Generative-AI/5.6-Local-Intelligent-Q&A-System/story1.txt @@ -0,0 +1,31 @@ +Once upon a time in a quaint village nestled between rolling hills, there was a young girl named Eliza who loved to explore the woods behind her home. The forest was a magical place, filled with tall trees that whispered secrets, streams that sang soft melodies, and flowers that seemed to glow under the moonlight. + +One sunny morning, Eliza set out on one of her adventures, her heart brimming with excitement. As she wandered deeper into the forest, she discovered a hidden path she had never seen before. The path was lined with shimmering stones that sparkled like stars. Curious and intrigued, Eliza followed it. + +After a short walk, the path led her to a magnificent clearing where a majestic oak tree stood in the center. At the base of the tree was a small, ornate door. It was covered in intricate carvings of animals and vines. Eliza, with her heart pounding with both excitement and nervousness, gently pushed the door open. + +Inside, she found herself in a cozy, enchanted room. There were shelves lined with books and strange artifacts, and a warm fire crackling in a stone hearth. In the middle of the room, a wise old owl perched on a branch of a large, leafy plant. + +The owl looked at Eliza with kind, knowing eyes. “Welcome, young traveler,” it hooted softly. “I am Oliver, the guardian of this magical realm. Few people find their way here. You must have a special heart.” + +Eliza’s eyes widened in awe. “What is this place?” she asked. 
+ +“This is the Realm of Wonders,” Oliver explained. “It is a place where dreams come to life and where those with pure intentions can find their heart’s true desire.” + +Eliza gazed around the room, her curiosity piqued. “What can I do here?” + +Oliver smiled. “You can make a wish. But remember, wishes made here come with great responsibility. They have the power to change not just your life but the lives of those around you.” + +Eliza thought long and hard. She remembered how her village had been struggling with drought and how her friends and family were suffering. With a determined look, she made her wish. + +“I wish for rain to fall upon my village and bring life back to the land.” + +Oliver nodded approvingly. “A selfless wish. It will be granted.” + +The next morning, as Eliza returned to her village, dark clouds gathered in the sky, and a gentle rain began to fall. The villagers looked up in amazement as the parched earth drank in the life-giving water. The fields began to turn green, and the village flourished once more. + +Eliza’s heart swelled with joy as she realized the impact of her wish. The Realm of Wonders had given her the chance to make a difference, and she learned that true magic comes from caring for others. + +From that day on, Eliza continued to explore the woods, knowing that the true wonders of life were found in kindness and selflessness. + +And so, the village thrived, and Eliza’s adventures became the stuff of legends, reminding everyone that magic, indeed, begins with a kind heart. 
\ No newline at end of file diff --git a/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/README.md b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/README.md new file mode 100644 index 0000000..b8a6321 --- /dev/null +++ b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/README.md @@ -0,0 +1,56 @@ +# Image Generation Models and Diffusion Models + +This article explores the field of image generation, focusing on diffusion models. By looking at the working principles of cutting-edge systems such as DALL·E and Stable Diffusion, we aim to understand their practical applications and development potential. + +## Introduction to Diffusion Models in Image Generation + +

+ diffusion model +

+ +Image source: https://chrislee0728.medium.com/%E5%BE%9E%E9%A0%AD%E9%96%8B%E5%A7%8B%E5%AD%B8%E7%BF%92stable-diffusion-%E4%B8%80%E5%80%8B%E5%88%9D%E5%AD%B8%E8%80%85%E6%8C%87%E5%8D%97-ec34d7726a6c + +Diffusion models are a type of generative model that creates new images through a reverse diffusion process. Starting from Gaussian noise, a diffusion model removes noise step by step until an image emerges. This process can be seen as a gradual “denoising” procedure for image generation. It is a powerful generation method that has received widespread attention in recent years. + +## The Mechanism of Stable Diffusion + +The working principle of Stable Diffusion is based on the reverse process of diffusion models. The fundamental idea of diffusion models is to start with a completely random noise image and progressively infer a clear image. Two processes are involved: + +1. **Forward Diffusion Process.** During the forward diffusion process, given a real image, noise is gradually added until it becomes a completely random noise image. This process can be understood as the “destructive” step applied to the image. Noise is typically added incrementally by sampling from a Gaussian distribution, with each step making the image noisier and less recognizable. + +2. **Reverse Diffusion Process.** The reverse diffusion process is the core of Stable Diffusion. Given a completely random noise image, the model learns how to progressively denoise it into a clear image. The reverse process is the opposite of the forward process; it uses a trained neural network to infer a cleaner image at each step. At each step, the model predicts the noise present in the current image and subtracts it to obtain a less noisy image. After many reverse steps, the model can generate a clear, high-resolution image from pure noise. + +
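The two processes described above can be made concrete with a toy sketch on a short list of numbers instead of an image. It follows the common DDPM-style closed form, in which a noisy sample is a weighted mix of the clean signal and Gaussian noise. This is an illustration under assumptions, not Stable Diffusion's actual implementation: the linear noise schedule and the function names are hypothetical choices, and the true noise is passed in as an oracle where a trained network's prediction would normally go.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    alpha_bars = []
    product = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def forward_diffuse(x0, alpha_bar, rng):
    """Forward process: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps, eps ~ N(0, 1)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [
        math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
        for x, e in zip(x0, eps)
    ]
    return xt, eps


def estimate_x0(xt, predicted_eps, alpha_bar):
    """Reverse direction: subtract the predicted noise to estimate the clean signal."""
    return [
        (x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
        for x, e in zip(xt, predicted_eps)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [1.0, -1.0, 0.5, 0.0]
    alpha_bars = make_alpha_bars(1000)
    noisy, true_noise = forward_diffuse(clean, alpha_bars[499], rng)
    # Oracle: the true noise stands in for a neural network's prediction.
    print(estimate_x0(noisy, true_noise, alpha_bars[499]))
```

With a perfect noise prediction the clean signal is recovered up to floating-point error. In a real model the prediction is imperfect, which is one reason denoising is spread over many small steps.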

+ diffusion model +

+ +Image source: https://sushant-kumar.com/blog/ddpm-denoising-diffusion-probabilistic-models + +## Deploying the SD Model on Jetson Devices + +We can quickly deploy Stable Diffusion WebUI on Jetson devices using the `jetson-examples` tool. This project allows users to load the Stable Diffusion model and configure workflows through a graphical interface. + +**Step 1.** Install `jetson-examples` on your Jetson device. Open a terminal on your Jetson device and enter: + +```bash +pip3 install jetson-examples +``` + +**Step 2.** Use the `jetson-examples` tool to install the stable-diffusion-webui project with one command: + +```bash +reComputer run stable-diffusion-webui +``` + +**Step 3.** Open your browser and go to `http://:7860`, with the Jetson's IP address as the host before `:7860`, to start generating images with Stable Diffusion. + +

+ test +

+ +## More Reference Content +- https://wiki.seeedstudio.com/How_to_run_local_llm_text_to_image_on_reComputer/ +- https://www.jetson-ai-lab.com/tutorial_stable-diffusion.html +- https://github.com/Seeed-Projects/jetson-examples/blob/main/reComputer/scripts/comfyui/README.md + + diff --git a/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/images/diffusion_model.png b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/images/diffusion_model.png new file mode 100644 index 0000000..5cdba2a Binary files /dev/null and b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/images/diffusion_model.png differ diff --git a/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/images/diffusion_model1.png b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/images/diffusion_model1.png new file mode 100644 index 0000000..5c7d6e9 Binary files /dev/null and b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/images/diffusion_model1.png differ diff --git a/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/images/test.png b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/images/test.png new file mode 100644 index 0000000..6d249c5 Binary files /dev/null and b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/images/test.png differ diff --git a/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/language/README_zh-CN.md b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/language/README_zh-CN.md new file mode 100644 index 0000000..6a28510 --- /dev/null +++ b/5-Generative-AI/5.7-Image-Generation-Models-and-Diffusion-Models/language/README_zh-CN.md @@ -0,0 +1,56 @@ +# Image Generation Models and Diffusion Models + +This article explores diffusion models in the field of image generation. By looking at the working principles of cutting-edge systems such as DALL·E and Stable Diffusion, we aim to understand their practical applications and development potential. + +## Introduction to Diffusion Models in Image Generation + +

+ diffusion model +

+ +Image source: https://chrislee0728.medium.com/%E5%BE%9E%E9%A0%AD%E9%96%8B%E5%A7%8B%E5%AD%B8%E7%BF%92stable-diffusion-%E4%B8%80%E5%80%8B%E5%88%9D%E5%AD%B8%E8%80%85%E6%8C%87%E5%8D%97-ec34d7726a6c + +Diffusion models are a type of generative model that creates new images through a reverse diffusion process. Starting from Gaussian noise, a diffusion model removes noise step by step until an image emerges. This process can be seen as a gradual “denoising” procedure for image generation. It is a powerful generation method that has received widespread attention in recent years. + +## The Mechanism of Stable Diffusion + +The working principle of Stable Diffusion is based on the reverse process of diffusion models. The fundamental idea of diffusion models is to start with a completely random noise image and progressively infer a clear image. Two processes are involved: + +1. Forward Diffusion Process +During the forward diffusion process, given a real image, noise is gradually added until it becomes completely random noise. This process can be understood as the step that “destroys” the image. Noise is typically added incrementally by sampling from a Gaussian distribution, with each step making the image noisier and less recognizable. +2. Reverse Diffusion Process +The reverse diffusion process is the core of Stable Diffusion. Given a completely random noise image, the model learns how to progressively denoise it into a clear image. The reverse process is the opposite of the forward process; it uses a trained neural network to infer a cleaner image at each step. At each step, the model predicts the noise present in the current image and subtracts it to obtain a less noisy image. After many reverse steps, the model can generate a clear, high-resolution image from pure noise. + +

+ diffusion model +

+ +Image source: https://sushant-kumar.com/blog/ddpm-denoising-diffusion-probabilistic-models + +## Deploying the SD Model on Jetson Devices + +We can quickly deploy Stable Diffusion WebUI on Jetson devices using the jetson-examples tool. This project allows users to load the Stable Diffusion model and configure workflows through a graphical interface. + +**Step 1.** Install jetson-examples on the Jetson device +Open a terminal on the Jetson device and enter: + +```bash +pip3 install jetson-examples +``` + +**Step 2.** Use the `jetson-examples` tool to install the stable-diffusion-webui project with one command + +```bash +reComputer run stable-diffusion-webui +``` + +**Step 3.** Open `http://:7860` in a browser, with the Jetson's IP address as the host before `:7860`, to start generating images with Stable Diffusion + +

+ test +

+ +## More Reference Content +- https://wiki.seeedstudio.com/How_to_run_local_llm_text_to_image_on_reComputer/ +- https://www.jetson-ai-lab.com/tutorial_stable-diffusion.html +- https://github.com/Seeed-Projects/jetson-examples/blob/main/reComputer/scripts/comfyui/README.md diff --git a/5-Generative-AI/README.md b/5-Generative-AI/README.md new file mode 100644 index 0000000..d84434c --- /dev/null +++ b/5-Generative-AI/README.md @@ -0,0 +1,11 @@ +## 📚 Table of Generative AI Modules + +| **Chapter** | **Content** | +|:-----------:|:------------------------------------------------:| +| Module 5.1| [Introduction-to-Generative-AI](./5.1-Introduction-to-Generative-AI/README.md)| +| Module 5.2| [Generative-Adversarial-Network](./5.2-Generative-Adversarial-Network/README.md)| +| Module 5.3| [Variational-Autoencoders](./5.3-Variational-Autoencoders/README.md)| +| Module 5.4| [Autoregressive-Models-and-Generative-Text-Models](./5.4-Autoregressive-Models-and-Generative-Text-Models/README.md)| +| Module 5.5| [The-Evolution-of-Transformer-Models-and-Generative-AI](./5.5-The-Evolution-of-Transformer-Models-and-Generative-AI/README.md)| +| Module 5.6| [Local-Intelligent-Q&A-System](./5.6-Local-Intelligent-Q&A-System/README.md)| +| Module 5.7| [Image-Generation-Models-and-Diffusion-Models](./5.7-Image-Generation-Models-and-Diffusion-Models/README.md)| diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/README.md b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/README.md new file mode 100644 index 0000000..33bd03b --- /dev/null +++ b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/README.md @@ -0,0 +1,257 @@ +# Overview of ROS and Environment Setup + +## Introduction + +This tutorial provides a concise overview of ROS (Robot Operating System) and guides you through quickly installing and trying out ROS on the [reComputer J3010 Nvidia Jetson Orin 
Nano](https://www.seeedstudio.com/reComputer-J3010-w-o-power-adapter-p-5631.html). By the end of this tutorial, you will have a working ROS environment and will be able to run a simple ROS demo. + +### Prerequisites + +To follow this tutorial, you will need the following hardware and software: +- **Hardware:** [reComputer J3010 (Nvidia Jetson Orin Nano)](https://www.seeedstudio.com/reComputer-J3010-w-o-power-adapter-p-5631.html), display, keyboard, and mouse. +- **Software:** JetPack 5.1.1, Ubuntu 20.04, ROS Noetic, Python and C++. +

+ + J3010 + +

+ +## Introduction to the Development of ROS + +### [What is ROS?](https://vimeo.com/639236696) + +ROS (Robot Operating System) is an open-source framework for robot software development. It provides a structured communications layer above the host operating systems of a heterogeneous compute cluster. ROS is designed to be as thin as possible and consists of two parts: + +- **ROS system (ROS):** The plumbing system that handles communication between processes. It is the middleware that allows different robot parts to communicate with each other. +- **ROS packages:** Libraries and tools needed to write robot applications. +

+ + J3010 + +

+ +### [Why ROS?](https://www.ros.org/blog/why-ros/) + +ROS simplifies the process of creating complex and robust robot behavior across a wide variety of robotic platforms. Some of the key objectives and advantages of using ROS include: + +- **Rapid Development:** Provides a standard platform that accelerates development from research to production. +- **Global Community:** Supported by a large, active community contributing to and improving the software. +- **Proven Track Record:** Widely used in academic and commercial robotics. +- **Time to Market:** Helps reduce product development time with comprehensive tools and libraries. +- **Versatility:** Supports multiple domains and platforms, including embedded systems. +- **Open Source:** Free to use, modify, and extend, promoting innovation and collaboration. +- **Commercial Friendly:** Distributed under permissive licenses like Apache 2.0. + +### History and Development of ROS + +The history of ROS (Robot Operating System) is intertwined with the broader evolution of robotics: + +

+ + J3010 + +

+ +#### Early Robotics Developments +- **1959:** The journey of robotics began with the development of the first automated robot. +- **1972:** Emergence of robots capable of interacting with their environment. +- **1982:** Integration of robots into computer systems for complex tasks. +- **1988:** Advances in robotic automation and control systems. +- **2002:** Introduction of consumer robots like the Roomba for household chores. +- **2003:** Expansion of robotic exploration to Mars with rovers. +- **2005:** Crucial roles in industrial automation, exemplified by warehouse robots. + +#### The Birth of ROS +- **2007:** Development of ROS began under the name "Switchyard" at the Stanford Artificial Intelligence Laboratory by Morgan Quigley, Eric Berger, and Andrew Ng. It aimed to address the lack of shared software infrastructure in robotics research. +- **2008:** Development continued at Willow Garage, a robotics research lab. +- **2009:** Official establishment of ROS, marked by its first version release. + +#### Growth and Evolution +- **2010:** Release of ROS 1.0, with Willow Garage playing a crucial role in its development and community growth. +- **2014:** Introduction of social and service robots like Pepper, highlighting advancements in human-robot interaction. 
+- **2021:** ROS evolves to support sophisticated and versatile robotic systems for various applications. + +### [ROS Releases Timeline](https://docs.ros.org/en/rolling/Releases.html) + +| Release Name | Distribution | Release Date | EOL Date | +|--------------|--------------|--------------|----------------| +| Boxturtle | ROS 1 | March 2010 | March 2011 | +| C Turtle | ROS 1 | August 2010 | August 2011 | +| Diamondback | ROS 1 | March 2011 | November 2012 | +| Electric Emys| ROS 1 | August 2011 | January 2013 | +| Fuerte | ROS 1 | April 2012 | July 2013 | +| Groovy Galapagos| ROS 1 | December 2012| May 2014 | +| Hydro Medusa | ROS 1 | September 2013| May 2015 | +| Indigo Igloo | ROS 1 | July 2014 | April 2019 | +| Jade Turtle | ROS 1 | May 2015 | May 2017 | +| Kinetic Kame | ROS 1 | May 2016 | April 2021 | +| Lunar Loggerhead | ROS 1 | May 2017 | May 2019 | +| Melodic Morenia| ROS 1 | May 2018 | May 2023 | +| Noetic Ninjemys | ROS 1 | May 2020 | May 2025 | +| Foxy Fitzroy | ROS 2 | June 2020 | June 2023 | +| Galactic Geochelone | ROS 2 | May 2021 | November 2022 | +| Humble Hawksbill | ROS 2 | May 2022 | May 2027 | +| Rolling Ridley | ROS 2 | Ongoing | Ongoing | + +Today, ROS is maintained by Open Robotics, a non-profit organization dedicated to developing the core ROS system, including ROS 2.0, which incorporates improvements for real-time and embedded systems, along with other tools and libraries. + + + +## ROS Environment Installation and Quick Experience + +### Install ROS1 +- **Step 1:** Open Terminal and Update System Packages. + ```bash + sudo apt update + sudo apt upgrade + ``` +- **Step 2:** Install Basic Tools. + ```bash + sudo apt install curl gnupg2 lsb-release + ``` +- **Step 3:** Add ROS repository key. + ```bash + sudo curl -sSL https://raw.githubusercontent.com/ros/rosdistro/master/ros.asc | sudo apt-key add - + ``` +- **Step 4:** Add ROS repository. 
+ ```bash + sudo sh -c 'echo "deb http://packages.ros.org/ros/ubuntu $(lsb_release -sc) main" > /etc/apt/sources.list.d/ros-latest.list' + ``` +- **Step 5:** Update package list. + ```bash + sudo apt update + ``` +- **Step 6:** Install ros-noetic-desktop-full. + ```bash + sudo apt install ros-noetic-desktop-full + sudo apt-get install python3-rosdep + ``` +- **Step 7:** Initialize rosdep. + ```bash + sudo rosdep init + rosdep update + ``` +- **Step 8:** Set Up ROS Environment Variables. + ```bash + echo "source /opt/ros/noetic/setup.bash">> ~/.bashrc && + source ~/.bashrc + ``` +- **Step 9:** Install Dependency Tools. + ```bash + sudo apt install python3-rosinstall python3-rosinstall-generator python3-wstool build-essential + ``` +- **Step 10:** Test the Installation. + ```bash + roscore + ``` +
+ +
+ +### Quick Start with ROS + +To quickly experience ROS, let's create a ROS workspace and run a simple demo. + +1. **Create a ROS workspace** + ```bash + mkdir -p ~/catkin_ws/src + cd ~/catkin_ws/ + catkin_make + ``` + +2. **Source the setup file** + ```bash + source devel/setup.bash + ``` + +3. **Run a demo** + ```bash + roscore + ``` + Open another terminal and run: + ```bash + rosrun turtlesim turtlesim_node + ``` + Open yet another terminal and run: + ```bash + rosrun turtlesim turtle_teleop_key + ``` +
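As an alternative to steering the turtle from the keyboard, the same velocity commands can be published from a small Python node. The sketch below assumes `roscore` and `turtlesim_node` are already running as in the steps above and that the ROS environment has been sourced. The node name `turtle_circle`, the helper `circle_command`, and the file name `turtle_circle.py` are hypothetical. It publishes `geometry_msgs/Twist` messages on `/turtle1/cmd_vel`, the topic turtlesim listens to.

```python
def circle_command(radius, linear_speed):
    """Return (linear_x, angular_z) that makes the turtle trace a circle."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    # For circular motion, angular velocity = linear velocity / radius.
    return linear_speed, linear_speed / radius


def main():
    # Imported lazily so the helper above can be used without ROS installed.
    import rospy
    from geometry_msgs.msg import Twist

    rospy.init_node("turtle_circle")
    pub = rospy.Publisher("/turtle1/cmd_vel", Twist, queue_size=10)
    rate = rospy.Rate(10)  # publish at 10 Hz
    linear_x, angular_z = circle_command(radius=2.0, linear_speed=1.0)
    try:
        while not rospy.is_shutdown():
            msg = Twist()
            msg.linear.x = linear_x
            msg.angular.z = angular_z
            pub.publish(msg)
            rate.sleep()
    except rospy.ROSInterruptException:
        pass  # raised when the node is shut down during sleep


if __name__ == "__main__":
    try:
        main()
    except ImportError:
        print("rospy not found; run 'source /opt/ros/noetic/setup.bash' first")
```

Saved as `turtle_circle.py`, it can be started with `python3 turtle_circle.py` in a third terminal in place of `turtle_teleop_key`. Press Ctrl+C to stop it.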
+ +
+ +This quick demo shows a graphical turtle robot that you can control using the keyboard. + +### Installation of Common Development Software for ROS + +#### Installation of VScode and ROS Development Extensions +1. **For the installation of VSCode, please refer to the previous tutorial: [3.1-Python and Programming Fundamentals](https://github.com/Seeed-Projects/reComputer-Jetson-for-Beginners/blob/main/3-Basic-Tools-and-Getting-Started/3.1-Python-and-Programming-Fundamentals/README.md)** + +2. **Install tools such as `Python`, `ROS`, `C++`, and `CMake Tools` from the VSCode Extensions Marketplace.** +
+ +
+ +#### Install the Terminator multi-functional terminal. +1. **Install** + ```bash + sudo apt-get update + sudo apt install terminator + ``` +
+ +
+2. **Show Applications ---> Search for "Terminator" ---> Right-click and select "Add to Favorites"** + +3. **Common Terminator Shortcuts** + - **Alt + Up**: Move to the terminal above + - **Alt + Down**: Move to the terminal below + - **Alt + Left**: Move to the terminal on the left + - **Alt + Right**: Move to the terminal on the right + - **Ctrl + Shift + O**: Split terminal horizontally + - **Ctrl + Shift + E**: Split terminal vertically + - **Ctrl + Shift + Right**: Move the splitter to the right in a vertically split terminal + - **Ctrl + Shift + Left**: Move the splitter to the left in a vertically split terminal + - **Ctrl + Shift + Up**: Move the splitter up in a horizontally split terminal + - **Ctrl + Shift + Down**: Move the splitter down in a horizontally split terminal + - **Ctrl + Shift + S**: Hide/Show the scroll bar + - **Ctrl + Shift + F**: Search + - **Ctrl + Shift + C**: Copy selected content to clipboard + - **Ctrl + Shift + V**: Paste clipboard content + - **Ctrl + Shift + W**: Close the current terminal + - **Ctrl + Shift + Q**: Quit the current window, closing all terminals within it + - **Ctrl + Shift + X**: Maximize the current terminal + - **Ctrl + Shift + Z**: Maximize the current terminal and enlarge the font + - **Ctrl + Shift + N or Ctrl + Tab**: Move to the next terminal + - **Ctrl + Shift + P or Ctrl + Shift + Tab**: Move to the previous terminal + - **F11**: Toggle full screen + - **Ctrl + Shift + T**: Open a new tab + - **Ctrl + PageDown**: Move to the next tab + - **Ctrl + PageUp**: Move to the previous tab + - **Ctrl + Shift + PageDown**: Swap the current tab with the next tab + - **Ctrl + Shift + PageUp**: Swap the current tab with the previous tab + - **Ctrl + Plus (+)**: Increase font size + - **Ctrl + Minus (-)**: Decrease font size + - **Ctrl + Zero (0)**: Reset font size to the original + - **Ctrl + Shift + R**: Reset terminal state + - **Ctrl + Shift + G**: Reset terminal state and clear the screen + - **Super + g**: Bind 
all terminals, allowing input to be mirrored across all terminals + - **Super + Shift + G**: Unbind all terminals + - **Super + t**: Bind all terminals in the current tab, mirroring input across them + - **Super + Shift + T**: Unbind terminals in the current tab + - **Ctrl + Shift + I**: Open a new window, sharing the process with the original window + - **Super + i**: Open a new window with a separate process from the original window \ No newline at end of file diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/Development-history-of-mobile-robot.png b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/Development-history-of-mobile-robot.png new file mode 100644 index 0000000..191aa07 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/Development-history-of-mobile-robot.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/terminator.png b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/terminator.png new file mode 100644 index 0000000..90cde18 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/terminator.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/turtle.png b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/turtle.png new file mode 100644 index 0000000..40d9509 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/turtle.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/vscode_plugs.png b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/vscode_plugs.png new file mode 100644 index 0000000..67405f5 Binary 
files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.1-Overview of ROS and Environment Setup/images/vscode_plugs.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/README.md b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/README.md new file mode 100644 index 0000000..8bba04a --- /dev/null +++ b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/README.md @@ -0,0 +1,231 @@ +# HelloWorld Implementation Overview + +ROS programming primarily uses C++ and Python. Most programs can be implemented in both languages. Each tutorial will demonstrate examples in both C++ and Python, allowing users to choose the implementation that suits them best. + +The general implementation process in ROS is similar across different languages. Taking the HelloWorld program as an example, the steps are: + +1. Create a workspace. +2. Create a package. +3. Edit source files. +4. Edit configuration files. +5. Compile and execute. + +The main difference between C++ and Python lies in steps 3 and 4. The following sections detail the common steps for both implementations, with specific sections for C++ and Python. + +## HelloWorld (C++ Version) + +1. **Create and Initialize Workspace** + ```bash + mkdir -p /src + cd + catkin_make + ``` + For example: + ```bash + mkdir -p seeed_ws/src + cd seeed_ws + catkin_make + ``` + +2. **Create ROS Package and Add Dependencies** + ```bash + cd src + catkin_create_pkg roscpp rospy std_msgs + ``` + For example: + ```bash + cd src + catkin_create_pkg hello_world roscpp rospy std_msgs + ``` +3. 
**Edit Source File** + + Navigate to your package’s `src` directory and create a new C++ source file (e.g., `hello.cpp`): + ```bash + cd ~//src//src + touch hello.cpp + ``` + For example: + ```bash + cd ~/seeed_ws/src/hello_world/src + touch hello.cpp + ``` + Copy the following code into `hello.cpp`: + ```cpp + #include "ros/ros.h" + + int main(int argc, char *argv[]) + { + // Register this process with ROS under the node name "hello" + ros::init(argc, argv, "hello"); + // Creating a node handle starts the node + ros::NodeHandle n; + // Log a message at INFO level + ROS_INFO("Hello World!"); + + return 0; + } + ``` +


+ +4. **Edit `CMakeLists.txt`** + + Add the following at the end of your package's `CMakeLists.txt`: + ```cmake + add_executable( src/hello.cpp) + target_link_libraries( ${catkin_LIBRARIES}) + ``` + For example: + ```cmake + add_executable(hello src/hello.cpp) + target_link_libraries(hello ${catkin_LIBRARIES}) + ``` +


+ +


+ + **Note: The `CMakeLists.txt` file mentioned here is the one in the package directory you created, not the one in the workspace directory.** + +5. **Compile the Workspace** + ```bash + cd + catkin_make + ``` + For example: + ```bash + cd ~/seeed_ws + catkin_make + ``` +6. **Run the Program** + + Open one terminal and start ROS core: + ```bash + roscore + ``` + Open another terminal, source the workspace, and run the node: + ```bash + cd + source devel/setup.bash + rosrun hello + ``` + For example: + ```bash + cd ~/seeed_ws + source devel/setup.bash + rosrun hello_world hello + ``` +


+ +You should see the output: `Hello World!` + +## HelloWorld (Python Version) + +1. **Create and Initialize Workspace** + ```bash + mkdir -p /src + cd + catkin_make + ``` + For example: + ```bash + mkdir -p seeed_ws/src + cd seeed_ws + catkin_make + ``` + +2. **Create ROS Package and Add Dependencies** + ```bash + cd src + catkin_create_pkg roscpp rospy std_msgs + ``` + For example: + ```bash + cd src + catkin_create_pkg hello_world roscpp rospy std_msgs + ``` +3. **Add `scripts` Directory and Create Python File** + + In your package directory, create a `scripts` directory and a new Python file inside it (e.g., `hello.py`). + + For example: + ```bash + mkdir ~/seeed_ws/src/hello_world/scripts + cd ~/seeed_ws/src/hello_world/scripts + touch hello.py + ``` + + Copy the following code into `hello.py`: + ```python + #!/usr/bin/env python + + import rospy + + if __name__ == "__main__": +     # Register this process with ROS under the node name "hello" +     rospy.init_node("hello") +     # Log a message at INFO level +     rospy.loginfo("Hello World!") + ``` + +4. **Add Executable Permissions** + + Run this from the `scripts` directory. You own the file, so `sudo` is not needed: + ```bash + chmod +x hello.py + ``` + +5. **Edit `CMakeLists.txt`** + + Add the following at the end of your package's `CMakeLists.txt`: + ```cmake + catkin_install_python(PROGRAMS scripts/hello.py + DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION} + ) + ``` +


+6. **Compile the Workspace** + ```bash + cd + catkin_make + ``` + for example: + ```bash + cd ~/seeed_ws + catkin_make + ``` +7. **Run the Program** + Open one terminal and start ROS core: + ```bash + roscore + ``` + Open another terminal, source the workspace, and run the node: + ```bash + cd + source devel/setup.bash + rosrun hello.py + ``` + For example: + ```bash + cd ~/seeed_ws + source devel/setup.bash + rosrun hello_world hello.py + ``` + You should see the output: `Hello World!` + +## Note +To make sourcing the workspace setup file more convenient, add it to your `.bashrc`: +```bash +echo "source ~//devel/setup.bash" >> ~/.bashrc +``` +This ensures that the workspace is sourced automatically whenever a new terminal is opened. \ No newline at end of file diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/cmakelists.png b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/cmakelists.png new file mode 100644 index 0000000..02b0d04 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/cmakelists.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/cmakelists_dir.png b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/cmakelists_dir.png new file mode 100644 index 0000000..54c1ec1 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/cmakelists_dir.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/hello_world_c.png b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/hello_world_c.png new file mode 100644 index 0000000..8757885 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for 
ROS/images/hello_world_c.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/hello_world_result_c.png b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/hello_world_result_c.png new file mode 100644 index 0000000..dfe70cc Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.2-Quick Experience with HelloWorld for ROS/images/hello_world_result_c.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.3-ROS Architecture/README.md b/6-Robotics/6.1-Introduction to ROS/6.1.3-ROS Architecture/README.md new file mode 100644 index 0000000..53043a9 --- /dev/null +++ b/6-Robotics/6.1-Introduction to ROS/6.1.3-ROS Architecture/README.md @@ -0,0 +1,154 @@ +### 6.3.1 ROS Architecture + +#### ROS Filesystem + +The ROS filesystem structure on the hard disk is organized as follows: + +


+ +``` +WorkSpace --- Custom workspace + + |--- build: Compilation space for storing CMake and catkin cache, configuration, and other intermediate files. + |--- devel: Development space for storing compiled target files including headers, dynamic & static libraries, executables, etc. + |--- src: Source code + + |-- package: ROS package (basic ROS unit) containing multiple nodes, libraries, and configuration files. Package names should be lowercase, consisting of letters, numbers, and underscores. + |-- CMakeLists.txt: Configuration for compiling rules, including source files, dependencies, and target files. + |-- package.xml: Package information such as name, version, author, dependencies. + |-- scripts: Directory for Python files. + |-- src: Directory for C++ source files. + |-- include: Header files. + |-- msg: Message communication format files. + |-- srv: Service communication format files. + |-- action: Action format files. + |-- launch: Launch files for running multiple nodes at once. + |-- config: Configuration files. + + |-- CMakeLists.txt: Basic configuration for compilation. +``` + +Some of these directories and files have already been discussed, such as package creation, writing C++ and Python files in the `src` and `scripts` directories, and creating launch files in the `launch` directory. The `package.xml` and `CMakeLists.txt` files have also been configured. Other directories will be introduced in later tutorials. + +#### package.xml + +The `package.xml` file defines the properties of the package, such as name, version, author, maintainer, and dependencies. The format is as follows: + +```xml + + + hello_world + 0.0.0 + The hello_world package + xuzuo + TODO + catkin + roscpp + rospy + std_msgs + roscpp + rospy + std_msgs + roscpp + rospy + std_msgs + + + +``` + +#### CMakeLists.txt + +The `CMakeLists.txt` file is the input to the CMake build system and is used to build the package. 
It includes configuration for compiling C++ and Python files, as well as defining dependencies. + +```cmake +cmake_minimum_required(VERSION 3.0.2) +project(demo01_hello_vscode) + +find_package(catkin REQUIRED COMPONENTS + roscpp + rospy + std_msgs +) + +catkin_package( +) + +include_directories( + ${catkin_INCLUDE_DIRS} +) + +add_executable(hellow src/hello.cpp) + +add_dependencies(hellow ${${PROJECT_NAME}_EXPORTED_TARGETS} ${catkin_EXPORTED_TARGETS}) + +target_link_libraries(hellow + ${catkin_LIBRARIES} +) + +catkin_install_python(PROGRAMS + scripts/hello.py + DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION} +) +``` + +#### ROS Filesystem Commands + +Common commands for interacting with the ROS filesystem: + +- **Create Package:** `catkin_create_pkg ...` +- **Install Package:** `sudo apt install ` +- **Remove Package:** `sudo apt purge ` +- **List Packages:** `rospack list` +- **Find Package:** `rospack find ` +- **Navigate to Package:** `roscd ` +- **List Package Files:** `rosls ` +- **Search Package:** `apt search ` +- **Edit Package File:** `rosed ` + +#### Executing ROS Commands + +- **Start ROS Core:** `roscore` +- **Run ROS Node:** `rosrun ` +- **Launch ROS File:** `roslaunch ` + +#### ROS Computational Graph + +The computational graph in ROS represents the runtime structure of a ROS system, showing the data flow between different nodes. It can be visualized using `rqt_graph`: + +```bash +rosrun rqt_graph rqt_graph +``` + +If not installed: +```bash +sudo apt install ros--rqt +sudo apt install ros--rqt-common-plugins +``` + +Replace `` with your ROS version (e.g., kinetic, melodic, noetic). + +### Computational Graph Demonstration + +Next, we'll demonstrate the computational graph using ROS's built-in turtle simulation. + +1. **Run the Example:** + Follow the previous instructions to run the turtle simulation. + +2. 
**View the Computational Graph:** + Open a new terminal and enter: + ```bash + rqt_graph + ``` + or + ```bash + rosrun rqt_graph rqt_graph + ``` + +You will see a network topology graph that displays the relationships between different nodes, similar to the image below. + + ![Computational Graph Example](./images/computatioinal.png) \ No newline at end of file diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.3-ROS Architecture/images/computatioinal.png b/6-Robotics/6.1-Introduction to ROS/6.1.3-ROS Architecture/images/computatioinal.png new file mode 100644 index 0000000..3f0a15f Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.3-ROS Architecture/images/computatioinal.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.3-ROS Architecture/images/filesystem.jpg b/6-Robotics/6.1-Introduction to ROS/6.1.3-ROS Architecture/images/filesystem.jpg new file mode 100644 index 0000000..ea44b12 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.3-ROS Architecture/images/filesystem.jpg differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/README.md b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/README.md new file mode 100644 index 0000000..a19059b --- /dev/null +++ b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/README.md @@ -0,0 +1,935 @@ +# 6.1.4-ROS Communication Mechanism: + +## Topic Communication Tutorial + +### Introduction to Topic Communication + +In ROS, topic communication is one of the fundamental ways for nodes to exchange information. This tutorial will guide you through the process of setting up basic topic communication using both C++ and Python. We will implement a simple publisher-subscriber model where the publisher sends text messages at a frequency of 10Hz, and the subscriber receives and prints these messages. + +
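
The publish-subscribe idea can be sketched without ROS at all. The block below is a toy, in-process model only: `ToyBus` is a hypothetical class, and it delivers messages with a direct function call, whereas real ROS nodes are separate processes that exchange messages over the network. It is meant to show just one property, namely that a publisher only names a topic and never refers to its subscribers.

```python
from collections import defaultdict


class ToyBus:
    """Hypothetical stand-in for topic routing; not part of ROS."""

    def __init__(self):
        # topic name -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Every subscriber of the topic receives its own copy of the message.
        for callback in self._subscribers[topic]:
            callback(message)


received = []
bus = ToyBus()
bus.subscribe("chatter", received.append)                # first listener stores messages
bus.subscribe("chatter", lambda m: print("Heard:", m))   # second listener prints them

for count in range(3):
    bus.publish("chatter", "Hello Seeed " + str(count))
```

Both listeners receive all three messages, which mirrors the many-to-many relationship between publishers and subscribers on a topic.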

+ + J3010 + + +#### 1. Theoretical Model + +Topic communication involves three main components: +- **ROS Master**: Manages the registration and connection of nodes. +- **Talker** (Publisher): Sends messages. +- **Listener** (Subscriber): Receives messages. + +The ROS Master helps establish connections between Talkers and Listeners. Here's a step-by-step breakdown of how the communication happens: + +- **Talker Registration**: The Talker registers itself with the ROS Master, including the topic name of its messages. +- **Listener Registration**: The Listener registers itself with the ROS Master, specifying the topic it wants to subscribe to. +- **Matching**: The ROS Master matches the Talker and Listener based on the topic name and sends the necessary connection information. +- **Connection Establishment**: The Listener requests a connection to the Talker, and the Talker confirms it. +- **Message Exchange**: Once connected, the Talker starts sending messages to the Listener. + +**Key Points**: +- The ROS Master is only needed for establishing the connection. +- The communication continues even if the ROS Master is shut down after the connection is established. +- Multiple Talkers and Listeners can exist, and they can start in any order. + +#### 2. Basic Topic Communication Operations (C++) + +**Objective**: Create a publisher node that sends text messages at 10Hz and a subscriber node that prints the received messages. + +**Steps**: + +0. **[Create package](../6.1.2-Quick%20Experience%20with%20HelloWorld%20for%20ROS/README.md)** + ```bash + cd ~/seeed_ws/src/ + catkin_create_pkg listener_and_talker roscpp rospy std_msgs + cd ~/seeed_ws/src/listener_and_talker/src + touch listener.cpp talker.cpp + ``` + +1. 
**Publisher Implementation**: + + `talker.cpp` + ```cpp + #include "ros/ros.h" + #include "std_msgs/String.h" + #include <sstream> + + int main(int argc, char *argv[]) { + // Set locale for printing messages in the local language + setlocale(LC_ALL, ""); + // Initialize the ROS node with a unique name + ros::init(argc, argv, "talker"); + // Create a ROS node handle + ros::NodeHandle nh; + // Create a publisher for std_msgs::String on topic "chatter" with a queue size of 10 + ros::Publisher pub = nh.advertise<std_msgs::String>("chatter", 10); + + std_msgs::String msg; + std::string msg_front = "Hello Seeed"; + int count = 0; + ros::Rate r(10); // 10Hz + + while (ros::ok()) { + std::stringstream ss; + ss << msg_front << count; + msg.data = ss.str(); + pub.publish(msg); + ROS_INFO("Sent message: %s", msg.data.c_str()); + r.sleep(); + count++; + } + return 0; + } + ``` + +2. **Subscriber Implementation**: + + `listener.cpp` + ```cpp + #include "ros/ros.h" + #include "std_msgs/String.h" + + // Callback invoked for every message received on "chatter" + void doMsg(const std_msgs::String::ConstPtr& msg_p) { + ROS_INFO("Heard: %s", msg_p->data.c_str()); + } + + int main(int argc, char *argv[]) { + setlocale(LC_ALL, ""); + ros::init(argc, argv, "listener"); + ros::NodeHandle nh; + // Subscribe to "chatter" with a queue size of 10 + ros::Subscriber sub = nh.subscribe("chatter", 10, doMsg); + // Process callbacks until the node is shut down + ros::spin(); + return 0; + } + ``` + +3. **CMakeLists.txt Configuration**: + + Add the following code at the end of your package's `CMakeLists.txt`: + ```cmake + add_executable(listener src/listener.cpp) + add_executable(talker src/talker.cpp) + + target_link_libraries(listener ${catkin_LIBRARIES}) + target_link_libraries(talker ${catkin_LIBRARIES}) + ``` +


+ +4. **Running the Code**: + - Open a terminal and start `roscore`: + ```bash + roscore + ``` + - In a new terminal, navigate to your workspace, source `devel/setup.bash`, and run the subscriber node: + ```bash + rosrun listener_and_talker listener + ``` + - In another terminal, source the workspace again and run the publisher node: + ```bash + rosrun listener_and_talker talker + ``` +


+ +


+ +You should see messages being published and received, displayed in the terminal. + +#### 3. Basic Topic Communication Operations (Python) + +**Objective**: Create a publisher node that sends text messages at 10Hz and a subscriber node that prints the received messages. + +**Steps**: + +0. **[Create package](../6.1.2-Quick%20Experience%20with%20HelloWorld%20for%20ROS/README.md)** + ```bash + cd ~/seeed_ws/src/ + catkin_create_pkg listener_and_talker roscpp rospy std_msgs + mkdir ~/seeed_ws/src/listener_and_talker/scripts + cd ~/seeed_ws/src/listener_and_talker/scripts + touch listener.py talker.py + ``` + +1. **Publisher Implementation**: + + `talker.py` + ```python + #!/usr/bin/env python + import rospy + from std_msgs.msg import String + + if __name__ == "__main__": +     rospy.init_node("talker_p") +     # Publish String messages on "chatter" with a queue size of 10 +     pub = rospy.Publisher("chatter", String, queue_size=10) +     msg = String() +     msg_front = "Hello Seeed " +     count = 0 +     rate = rospy.Rate(10) # 10Hz + +     while not rospy.is_shutdown(): +         msg.data = msg_front + str(count) +         pub.publish(msg) +         rospy.loginfo("Sent message: %s", msg.data) +         rate.sleep() +         count += 1 + ``` + +2. **Subscriber Implementation**: + + `listener.py` + ```python + #!/usr/bin/env python + import rospy + from std_msgs.msg import String + + def doMsg(msg): +     # Called for every message received on "chatter" +     rospy.loginfo("Heard: %s", msg.data) + + if __name__ == "__main__": +     rospy.init_node("listener_p") +     sub = rospy.Subscriber("chatter", String, doMsg, queue_size=10) +     rospy.spin() + ``` + +3. **Add Executable Permissions**: + + Run this from the `scripts` directory. You own the files, so `sudo` is not needed: + ```bash + chmod +x *.py + ``` + +4. **CMakeLists.txt Configuration**: + + Add the following code at the end of your package's `CMakeLists.txt`: + ```cmake + catkin_install_python(PROGRAMS + scripts/talker.py + scripts/listener.py + DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION} + ) + ``` + +5. 
**Running the Code**: + - Open a terminal and start `roscore`: + ```bash + roscore + ``` + - In a new terminal, navigate to your workspace and run the publisher node: + ```bash + rosrun listener_and_talker talker.py + ``` + - In another terminal, run the subscriber node: + ```bash + rosrun listener_and_talker listener.py + ``` + +You should see messages being published and received, displayed in the terminal. + +### ROS Topic Common Commands + +- `rostopic bw`: Display bandwidth usage of a topic +- `rostopic delay`: Display delay of a topic with a header +- `rostopic echo`: Print messages to the screen +- `rostopic find`: Find topics by type +- `rostopic hz`: Display publishing frequency of a topic +- `rostopic info`: Display information about a topic +- `rostopic list`: List all active topics +- `rostopic pub`: Publish data to a topic +- `rostopic type`: Print the type of a topic + +---- +## Introduction to Service Communication + +Service communication in ROS differs from topic communication in that it is bidirectional: a node sends a message and also receives feedback. This model consists of two main parts: +1. **Client**: The entity that sends a request. +2. **Server**: The entity that processes the request and sends back a response. + +When a client sends a request to a server, it waits for the server to process the request and return a response. This "request-reply" exchange completes the communication. + +#### How does it work? +- **Node B** (the server) provides a service interface, usually named something like `/service_name`. +- **Node A** (the client) sends a request to Node B. +- Node B processes the request and sends back a response. + +The communication process can be illustrated as follows: + +1. **Talker Node advertises a service via ROS Master:** + - The Talker node advertises a service (e.g., `advertiseService("bar", foo:1234)`) via the ROS Master, indicating its availability. + +2. 
**Listener Node looks up the service via ROS Master:** + - The Listener node sends a request to the ROS Master to find the service (e.g., `lookupService("bar")`). + +3. **ROS Master returns the service address:** + - The ROS Master responds with the service address (e.g., `foo:3456`) for the Listener node to connect. + +4. **Listener Node requests data from Talker Node:** + - The Listener node sends a service request to the Talker node, using XML/RPC for communication. + +5. **Talker Node replies with the requested data:** + - The Talker node processes the request and sends back the reply data over TCP. + +
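
The five steps above can be sketched as a toy, in-process model. This is not ROS code: `ToyMaster`, `ToyServer`, `call_service`, and the address string are hypothetical names chosen for illustration, and a dictionary lookup stands in for opening a TCP connection. The point is that the master only maps a service name to an address, while the request and reply travel directly between client and server.

```python
class ToyMaster:
    """Hypothetical name registry: maps service names to provider addresses."""

    def __init__(self):
        self._services = {}

    def advertise_service(self, name, address):
        self._services[name] = address

    def lookup_service(self, name):
        # Raises KeyError if no server has advertised this name.
        return self._services[name]


class ToyServer:
    """Hypothetical server: owns an address and a request handler."""

    def __init__(self, address, handler):
        self.address = address
        self.handler = handler

    def handle(self, request):
        return self.handler(request)


def call_service(master, network, name, request):
    """Client side: look up the address, then talk to the server directly."""
    address = master.lookup_service(name)   # steps 2 and 3
    server = network[address]               # stands in for connecting to the server
    return server.handle(request)           # steps 4 and 5; the caller waits here


master = ToyMaster()
server = ToyServer("foo:1234", lambda req: req["num1"] + req["num2"])
network = {server.address: server}

master.advertise_service("bar", server.address)   # step 1
result = call_service(master, network, "bar", {"num1": 1, "num2": 5})
print("Sum:", result)
```

Because `call_service` returns only after the handler finishes, the sketch also reflects the blocking behaviour of a service call.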

+ + J3010 + + +**Key Points**: +- The client is blocked until it receives a response from the server. +- Service communication is efficient, as it only consumes resources when needed (i.e., when a request is made). + +#### Theoretical Model + +The service communication model involves three key components: + +1. **ROS Master**: Manages the registration of both servers and clients, helping to establish connections based on matching service names. +2. **Server**: Provides the service. +3. **Client**: Requests the service. + +**Process Overview**: + +1. **Server Registration**: + - The server registers itself with the ROS Master, including the service name it provides. + +2. **Client Registration**: + - The client registers itself with the ROS Master, specifying the service it wants to use. + +3. **Matching and Connection**: + - The ROS Master matches the client and server based on the service name and facilitates the connection. + +4. **Request-Response Cycle**: + - The client sends a request to the server, which processes the request and returns a response. + +#### Topic vs. Service Communication + +Let's compare these two most common ROS communication methods to deepen our understanding: + +| **Aspect** | **Topic Communication** | **Service Communication** | +|--------------------|-------------------------|---------------------------| +| Communication Type | Asynchronous | Synchronous | +| Protocol | TCP/IP | TCP/IP | +| Communication Model| Publish-Subscribe | Request-Reply | +| Relationship | Many-to-Many | One-to-Many | +| Characteristics | Callback-based | Remote Procedure Call (RPC)| +| Use Cases | Continuous, high-frequency data | Low-frequency, specific tasks | +| Example | Publishing LiDAR data | Triggering a sensor or taking a photo | + +**Note**: Remote Procedure Call (RPC) refers to executing a function on a different process as if it were local. + +#### 5. 
Creating a Custom Service (srv) in ROS + +Let's work through a hands-on example: a custom service that sums two integers sent by the client. The server processes the request and returns the sum to the client. + +**Steps to Implement**: +0. **Create a New Package** + ```bash + cd ~/seeed_ws/src + catkin_create_pkg service_communication roscpp rospy std_msgs + cd ~/seeed_ws + catkin_make + ``` +1. **Define the srv File**: + The `srv` file defines the structure of the request and response. In this case, the request contains two integers, and the response contains their sum. + Create a new directory called `srv` in your package and add a file named `AddInts.srv`: + ```bash + mkdir ~/seeed_ws/src/service_communication/srv + cd ~/seeed_ws/src/service_communication/srv + touch AddInts.srv + ``` + Copy the following into `AddInts.srv`. The fields above the `---` separator form the request, and the fields below it form the response: + ```srv + int32 num1 + int32 num2 + --- + int32 sum + ``` +


+ +2. **Update the package.xml**: + Add the dependencies needed to generate message files to your package's `package.xml`: + ```xml + <build_depend>message_generation</build_depend> + <exec_depend>message_runtime</exec_depend> + ``` +

+ + J3010 + + +3. **Update CMakeLists.txt**: + - Include the necessary configurations to generate the service files in package's `CMakeLists.txt`: + ```cmake + find_package(catkin REQUIRED COMPONENTS + roscpp + rospy + std_msgs + message_generation + ) + + add_service_files( + FILES + AddInts.srv + ) + + generate_messages( + DEPENDENCIES + std_msgs + ) + ``` +


+ + +4. **Compile Your Package**: + - Compile your package to generate the service message headers: + ```bash + cd ~/seeed_ws + catkin_make + source devel/setup.bash + ``` + +### Implementing Service Communication (C++) + +This example demonstrates how to implement service communication in ROS using C++. We will create a simple service where the server adds two integers provided by the client and returns the sum. + +**1. Server Implementation:** + +`add_two_ints_server.cpp` +```cpp +#include "ros/ros.h" +#include "service_communication/AddInts.h" + +// Callback function to handle the client's request +bool add(service_communication::AddInts::Request &req, + service_communication::AddInts::Response &res) { + res.sum = req.num1 + req.num2; // Compute the sum + ROS_INFO("Request: a=%ld, b=%ld", (long int)req.num1, (long int)req.num2); + ROS_INFO("Sending back response: [%ld]", (long int)res.sum); + return true; +} + +int main(int argc, char **argv) { + ros::init(argc, argv, "add_two_ints_server"); + ros::NodeHandle nh; + + // Advertise the service to the ROS master + ros::ServiceServer service = nh.advertiseService("add_two_ints", add); + ROS_INFO("Ready to add two integers."); + ros::spin(); + + return 0; +} +``` + +**2. 
Client Implementation:** + +`add_two_ints_client.cpp` +```cpp +#include "ros/ros.h" +#include "service_communication/AddInts.h" +#include <cstdlib> + +int main(int argc, char **argv) { + ros::init(argc, argv, "add_two_ints_client"); + if (argc != 3) { + ROS_INFO("Usage: add_two_ints_client X Y"); + return 1; + } + + ros::NodeHandle nh; + // Create a client for the "add_two_ints" service of type AddInts + ros::ServiceClient client = nh.serviceClient<service_communication::AddInts>("add_two_ints"); + + // Prepare the service request + service_communication::AddInts srv; + srv.request.num1 = atoll(argv[1]); + srv.request.num2 = atoll(argv[2]); + + // Call the service and check if it was successful + if (client.call(srv)) { + ROS_INFO("Sum: %ld", (long int)srv.response.sum); + } else { + ROS_ERROR("Failed to call service add_two_ints"); + return 1; + } + + return 0; +} +``` + +**CMakeLists.txt Configuration:** + +Add the following lines at the end of `CMakeLists.txt` to compile both the server and the client: + +```cmake +add_executable(add_two_ints_server src/add_two_ints_server.cpp) +add_executable(add_two_ints_client src/add_two_ints_client.cpp) + +add_dependencies(add_two_ints_server ${${PROJECT_NAME}_EXPORTED_TARGETS} ${catkin_EXPORTED_TARGETS}) +add_dependencies(add_two_ints_client ${${PROJECT_NAME}_EXPORTED_TARGETS} ${catkin_EXPORTED_TARGETS}) + +target_link_libraries(add_two_ints_server ${catkin_LIBRARIES}) +target_link_libraries(add_two_ints_client ${catkin_LIBRARIES}) +``` + +#### Compile and run demo +Open one terminal: +```bash +cd ~/seeed_ws +catkin_make +roscore +``` +Open another terminal and start the server. The executable names match the `add_executable` targets above: +```bash +cd ~/seeed_ws +source devel/setup.bash +rosrun service_communication add_two_ints_server +``` + +Open another terminal and run the client with two integers: +```bash +cd ~/seeed_ws +source devel/setup.bash +rosrun service_communication add_two_ints_client 1 5 +``` + +


+ +#### Implementing Service Communication (Python) + +This Python example provides the same functionality as the C++ example: a service that adds two integers. + +Create a `scripts` folder under the package and create `add_two_ints_server.py` and `add_two_ints_client.py` inside it. +```bash +cd ~/seeed_ws/src/service_communication/ +mkdir scripts +cd scripts +touch add_two_ints_server.py add_two_ints_client.py +``` + +**1. Server Implementation:** + +`add_two_ints_server.py` +```python +#!/usr/bin/env python +import rospy +from service_communication.srv import AddInts, AddIntsRequest, AddIntsResponse + +def doReq(req): +    # Handle one request: add the two integers and return the response +    total = req.num1 + req.num2 +    rospy.loginfo("data: num1 = %d, num2 = %d, sum = %d", req.num1, req.num2, total) +    return AddIntsResponse(total) + +if __name__ == "__main__": +    rospy.init_node("addints_server_p") +    # Advertise the "AddInts" service and serve requests until shutdown +    server = rospy.Service("AddInts", AddInts, doReq) +    rospy.spin() +``` + +**2. Client Implementation:** + +`add_two_ints_client.py` +```python +#!/usr/bin/env python + +import sys +import rospy +from service_communication.srv import * + +if __name__ == "__main__": +    rospy.init_node("AddInts_Client_p") +    if len(sys.argv) != 3: +        rospy.logerr("Usage: add_two_ints_client.py X Y") +        sys.exit(1) +    client = rospy.ServiceProxy("AddInts", AddInts) +    # Block until the server has advertised the service +    client.wait_for_service() +    req = AddIntsRequest() +    req.num1 = int(sys.argv[1]) +    req.num2 = int(sys.argv[2]) +    resp = client.call(req) +    rospy.loginfo("result: %d", resp.sum) +``` + +**CMakeLists.txt Configuration:** + +Add the following lines to your `CMakeLists.txt` for the Python scripts: + +```cmake +catkin_install_python(PROGRAMS + scripts/add_two_ints_server.py + scripts/add_two_ints_client.py + DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION} +) +``` + +#### Compile and run demo +Open one terminal: +```bash +cd ~/seeed_ws +catkin_make +roscore +``` +Open another terminal: +```bash +cd ~/seeed_ws +source devel/setup.bash +rosrun service_communication add_two_ints_server.py +``` + +Open another terminal: +```bash 
+cd ~/seeed_ws +source devel/setup.bash +rosrun service_communication add_two_ints_client.py 1 5 +``` + + +#### Service Communication Commands + +To work with services in ROS, you'll use the `rosservice` command. Here's a list of common `rosservice` commands and their functions: + +- `rosservice args`: Print the arguments required by a service +- `rosservice call`: Call a service with the provided arguments +- `rosservice find`: Find services by type +- `rosservice info`: Print information about a service +- `rosservice list`: List all active services +- `rosservice type`: Print the type of a service +- `rosservice uri`: Print the ROSRPC URI of a service + + +--- +## Introduction to the ROS Parameter Server + +The ROS Parameter Server is a shared, multi-user, network-accessible storage space for parameters. It provides a way to store and retrieve parameters at runtime, which can be used to configure nodes or share data between them. Parameters on the server can be of various data types, including integers, booleans, strings, doubles, lists, and dictionaries. The Parameter Server is managed by the **ROS Master**, and nodes interact with it by setting, retrieving, or deleting parameters. + +### Theoretical Model of the Parameter Server + + +The Parameter Server involves three main roles: +1. **ROS Master**: Manages the Parameter Server, acting as a central storage for parameters. +2. **Talker**: A node that sets parameters on the server. +3. **Listener**: A node that retrieves parameters from the server. + + +The process of interacting with the Parameter Server typically involves the following steps: + +1. **Setting Parameters (Talker)**: + - The Talker node sends a parameter to the Parameter Server via RPC (Remote Procedure Call), including the parameter's name and value. The ROS Master stores this parameter in its list. + +2. 
**Retrieving Parameters (Listener)**: + - The Listener node requests a parameter from the Parameter Server by sending a query with the parameter's name. + +3. **Returning Parameters (ROS Master)**: + - The ROS Master searches for the requested parameter in its storage and returns the corresponding value to the Listener. + +
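The three steps above can be sketched as a toy, in-process model. This is only an illustration of the roles: `ToyParameterServer` and its methods are hypothetical names, not a ROS API, and the real Parameter Server is reached over RPC through the ROS Master rather than through a local object.

```python
class ToyParameterServer:
    """Hypothetical stand-in for the ROS Master's parameter storage."""

    def __init__(self):
        self._params = {}

    def set_param(self, name, value):
        # Step 1: a Talker stores a value under a parameter name.
        self._params[name] = value

    def get_param(self, name):
        # Steps 2 and 3: a Listener asks by name; the server looks the
        # name up and returns the stored value (KeyError if it is missing).
        return self._params[name]

    def delete_param(self, name):
        # Removing a parameter that does not exist raises KeyError.
        del self._params[name]


def demo():
    server = ToyParameterServer()
    server.set_param("/robot/max_speed", 1.5)    # Talker side
    return server.get_param("/robot/max_speed")  # Listener side


if __name__ == "__main__":
    print(demo())
```

The Talker and Listener never talk to each other directly in this model; they only share the parameter name, which is the property the Parameter Server is meant to provide.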

+ + + + +**Supported Data Types**: +- 32-bit integers +- Booleans +- Strings +- Doubles +- ISO8601 dates +- Lists +- Base64-encoded binary data +- Dictionaries + +### C++ Implementation + +**Setting Parameters (C++):** + +We'll start by setting various types of parameters on the Parameter Server using two different APIs: `ros::NodeHandle` and `ros::param`. + +`set_parameters.cpp` +```cpp +#include "ros/ros.h" + +#include <map> +#include <string> +#include <vector> + +int main(int argc, char *argv[]) { + ros::init(argc, argv, "set_parameters"); + + std::vector<std::string> students = {"Alice", "Bob", "Charlie", "David"}; + std::map<std::string, std::string> friends = {{"John", "Doe"}, {"Jane", "Smith"}}; + + // Using ros::NodeHandle to set parameters + ros::NodeHandle nh; + nh.setParam("int_param", 42); + nh.setParam("double_param", 3.14159); + nh.setParam("bool_param", true); + nh.setParam("string_param", "Hello ROS"); + nh.setParam("vector_param", students); + nh.setParam("map_param", friends); + + // Using ros::param to set parameters + ros::param::set("int_param_param", 84); + ros::param::set("double_param_param", 6.28318); + ros::param::set("bool_param_param", false); + ros::param::set("string_param_param", "Goodbye ROS"); + ros::param::set("vector_param_param", students); + ros::param::set("map_param_param", friends); + + return 0; +} +``` + +Add the following code at the end of your package's `CMakeLists.txt`: + +```cmake +add_executable(set_parameters src/set_parameters.cpp) +target_link_libraries(set_parameters ${catkin_LIBRARIES}) +``` + +In this example: +- We set various types of parameters on the Parameter Server, including integers, doubles, booleans, strings, vectors, and maps. +- We used both the `ros::NodeHandle` and `ros::param` APIs to set the parameters. + +**Retrieving Parameters (C++):** + +Next, we'll retrieve the parameters that we previously set on the Parameter Server. 
+ +`get_parameters.cpp` +```cpp +#include "ros/ros.h" + +#include <map> +#include <string> +#include <vector> + +int main(int argc, char *argv[]) { + ros::init(argc, argv, "get_parameters"); + + // Using ros::NodeHandle to retrieve parameters + ros::NodeHandle nh; + // Initial values are kept if a parameter is missing on the server + int int_value = 0; + double double_value = 0.0; + bool bool_value = false; + std::string string_value; + std::vector<std::string> students; + std::map<std::string, std::string> friends; + + nh.getParam("int_param", int_value); + nh.getParam("double_param", double_value); + nh.getParam("bool_param", bool_value); + nh.getParam("string_param", string_value); + nh.getParam("vector_param", students); + nh.getParam("map_param", friends); + + ROS_INFO("Retrieved values:"); + ROS_INFO("int_param: %d", int_value); + ROS_INFO("double_param: %.5f", double_value); + ROS_INFO("bool_param: %d", bool_value); + ROS_INFO("string_param: %s", string_value.c_str()); + + for (const auto &student : students) { + ROS_INFO("Student: %s", student.c_str()); + } + + for (const auto &friend_pair : friends) { + ROS_INFO("Friend: %s = %s", friend_pair.first.c_str(), friend_pair.second.c_str()); + } + + return 0; +} +``` + +Add the following code at the end of your package's `CMakeLists.txt`: + +```cmake +add_executable(get_parameters src/get_parameters.cpp) +target_link_libraries(get_parameters ${catkin_LIBRARIES}) +``` + +In this example: +- We retrieve the parameters set on the server using the `ros::NodeHandle` API. +- The retrieved parameters are then printed to the ROS log for verification. + +**Deleting Parameters (C++):** + +Finally, let's see how to delete parameters from the Parameter Server. + +`delete_parameters.cpp` +```cpp +#include "ros/ros.h" + +int main(int argc, char *argv[]) { + ros::init(argc, argv, "delete_parameters"); + + ros::NodeHandle nh; + bool success; + + // Using ros::NodeHandle to delete parameters + success = nh.deleteParam("int_param"); + ROS_INFO("Delete int_param: %s", success ? 
"Success" : "Failure"); + + // Using ros::param to delete parameters + success = ros::param::del("int_param_param"); + ROS_INFO("Delete int_param_param: %s", success ? "Success" : "Failure"); + + return 0; +} +``` +Add flowing code in end of your package's `CMakeLists.txt`: + +```cmake +add_executable(delete_parameters src/delete_parameters.cpp) +target_link_libraries(delete_parameters ${catkin_LIBRARIES}) +``` + +In this example: +- We use both `ros::NodeHandle` and `ros::param` APIs to delete parameters from the server. +- The success of the deletion is logged. + +### 2.2 Python Implementation + +**Setting Parameters (Python):** + +Let's now set parameters using Python. The process is very similar to the C++ version. + +```python +#!/usr/bin/env python + +import rospy + +if __name__ == "__main__": + rospy.init_node("set_parameters_py") + + # Setting various types of parameters + rospy.set_param("int_param", 42) + rospy.set_param("double_param", 3.14159) + rospy.set_param("bool_param", True) + rospy.set_param("string_param", "Hello ROS") + rospy.set_param("list_param", ["apple", "banana", "cherry"]) + rospy.set_param("dict_param", {"first_name": "John", "last_name": "Doe"}) + + # Modifying a parameter + rospy.set_param("int_param", 84) +``` + +Add flowing code in end of your package's `CMakeLists.txt`: + +```cmake +catkin_install_python(PROGRAMS + scripts/set_parameters_py.py + DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION} +) +``` + +In this example: +- We set various types of parameters, including integers, doubles, booleans, strings, lists, and dictionaries. +- We also demonstrate modifying an existing parameter. + +**Retrieving Parameters (Python):** + +Next, we'll retrieve the parameters that we set. 
+ +```python +#!/usr/bin/env python + +import rospy + +if __name__ == "__main__": + rospy.init_node("get_parameters_py") + + # Retrieving parameters; the second argument is the default returned when the parameter is missing + int_value = rospy.get_param("int_param", 0) + double_value = rospy.get_param("double_param", 0.0) + bool_value = rospy.get_param("bool_param", False) + string_value = rospy.get_param("string_param", "") + list_value = rospy.get_param("list_param", []) + dict_value = rospy.get_param("dict_param", {}) + + rospy.loginfo("Retrieved values:") + rospy.loginfo("int_param: %d", int_value) + rospy.loginfo("double_param: %.5f", double_value) + rospy.loginfo("bool_param: %s", bool_value) + rospy.loginfo("string_param: %s", string_value) + rospy.loginfo("list_param: %s", list_value) + rospy.loginfo("dict_param: %s", dict_value) +``` + +Update the `catkin_install_python` call at the end of your package's `CMakeLists.txt` so that it includes the new script: +```cmake +catkin_install_python(PROGRAMS + scripts/set_parameters_py.py + scripts/get_parameters_py.py + DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION} +) +``` + + +In this example: +- We use `rospy.get_param` to retrieve parameters set on the server, passing a default value for each one. +- The retrieved values are logged using `rospy.loginfo`. + +**Deleting Parameters (Python):** + +Finally, let's delete parameters from the Parameter Server using Python. 
+ +```python +#!/usr/bin/env python + +import rospy + +if __name__ == "__main__": + rospy.init_node("delete_parameters_py") + + try: + rospy.delete_param("int_param") + rospy.loginfo("int_param deleted successfully.") + except KeyError: + rospy.logwarn("int_param does not exist.") + + try: + rospy.delete_param("non_existent_param") + rospy.loginfo("non_existent_param deleted successfully.") + except KeyError: + rospy.logwarn("non_existent_param does not exist.") +``` + +Update the `catkin_install_python` call at the end of your package's `CMakeLists.txt` so that it includes the new script: +```cmake +catkin_install_python(PROGRAMS + scripts/set_parameters_py.py + scripts/get_parameters_py.py + scripts/delete_parameters_py.py + DESTINATION ${CATKIN_PACKAGE_BIN_DESTINATION} +) +``` + +In this example: +- We attempt to delete a parameter and use exception handling (`KeyError`) to cover the case where the parameter does not exist. + +### ROS Parameter Server Common Commands +`rosparam` includes command-line tools for getting and setting ROS parameters on the parameter server, using YAML-encoded files. + +- `rosparam set`: Set a parameter +- `rosparam get`: Get a parameter +- `rosparam load`: Load parameters from an external file +- `rosparam dump`: Dump parameters to an external file +- `rosparam delete`: Delete a parameter +- `rosparam list`: List all parameters + +Examples: + +- `rosparam list`: List all parameters on the parameter server. +- `rosparam set <param_name> <value>`: Set a parameter with a specific value. +- `rosparam get <param_name>`: Get the value of a specific parameter. +- `rosparam delete <param_name>`: Delete a specific parameter. +- `rosparam load <file.yaml>`: Load parameters from a YAML file. +- `rosparam dump <file.yaml>`: Dump the current parameters to a YAML file. 
+ + + diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/CMakeLists.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/CMakeLists.png new file mode 100644 index 0000000..2f39f35 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/CMakeLists.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/Parameter_Server.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/Parameter_Server.png new file mode 100644 index 0000000..e46aaec Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/Parameter_Server.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/Service.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/Service.png new file mode 100644 index 0000000..45e7ed3 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/Service.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/Topic.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/Topic.png new file mode 100644 index 0000000..87eff46 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/Topic.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/computatioinal.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/computatioinal.png new file mode 100644 index 0000000..3f0a15f Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/computatioinal.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/filesystem.jpg b/6-Robotics/6.1-Introduction to 
ROS/6.1.4-ROS Communication Mechanism/images/filesystem.jpg new file mode 100644 index 0000000..ea44b12 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/filesystem.jpg differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/package_xml.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/package_xml.png new file mode 100644 index 0000000..86113af Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/package_xml.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/run_listener_and_talker.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/run_listener_and_talker.png new file mode 100644 index 0000000..56bce4c Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/run_listener_and_talker.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/run_listener_and_talker_result.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/run_listener_and_talker_result.png new file mode 100644 index 0000000..a367401 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/run_listener_and_talker_result.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/run_service_c.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/run_service_c.png new file mode 100644 index 0000000..57a0a67 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/run_service_c.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/srv_cmakelists.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication 
Mechanism/images/srv_cmakelists.png new file mode 100644 index 0000000..41cb7c3 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/srv_cmakelists.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/srv_code.png b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/srv_code.png new file mode 100644 index 0000000..f963e62 Binary files /dev/null and b/6-Robotics/6.1-Introduction to ROS/6.1.4-ROS Communication Mechanism/images/srv_code.png differ diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.5-Common ROS Commands/README.md b/6-Robotics/6.1-Introduction to ROS/6.1.5-Common ROS Commands/README.md new file mode 100644 index 0000000..725806a --- /dev/null +++ b/6-Robotics/6.1-Introduction to ROS/6.1.5-Common ROS Commands/README.md @@ -0,0 +1,126 @@ +# Common ROS Commands + +In a robot system, there may be a few to dozens of nodes running simultaneously. Each node has a unique name, and they communicate using topics, services, messages, parameters, and more. A common challenge arises: when you need to customize a node to communicate with another existing node, how do you retrieve the topic and the message format being used by the other node? + +ROS provides a set of useful command-line tools to obtain various information about different nodes. The commonly used commands are as follows: + +- **rosnode**: Manage nodes +- **rostopic**: Manage topics +- **rosservice**: Manage services +- **rosmsg**: Manage message types (msg) +- **rossrv**: Manage service message types (srv) +- **rosparam**: Manage parameters + +These commands are dynamic, unlike the static file system commands. After the ROS program starts, these commands allow you to dynamically retrieve information about running nodes or parameters. + +## rosnode +`rosnode` is used to retrieve information about ROS nodes. 
+ +- `rosnode ping`: Test the connectivity status to a node +- `rosnode list`: List all active nodes +- `rosnode info`: Print information about a node +- `rosnode machine`: List nodes running on a specific machine +- `rosnode kill`: Terminate a node +- `rosnode cleanup`: Clean up unreachable nodes + +## rostopic +`rostopic` includes command-line tools to display debugging information about ROS topics, such as publishers, subscribers, publishing frequency, and ROS messages. It also includes an experimental Python library for dynamically retrieving information about topics and interacting with them. + +- `rostopic bw`: Display bandwidth usage of a topic +- `rostopic delay`: Display delay of a topic with a header +- `rostopic echo`: Print messages to the screen +- `rostopic find`: Find topics by type +- `rostopic hz`: Display publishing frequency of a topic +- `rostopic info`: Display information about a topic +- `rostopic list`: List all active topics +- `rostopic pub`: Publish data to a topic +- `rostopic type`: Print the type of a topic + +Examples: + +- `rostopic list(-v)`: Print the names of topics currently running (with `-v` for detailed information like the number of publishers and subscribers). +- `rostopic pub /topic_name msg_type "msg_content"`: Publish a message to a topic. +- `rostopic echo /topic_name`: Retrieve and print the current message being published on a topic. +- `rostopic info /topic_name`: Get detailed information about a topic, including message type, publisher, and subscriber information. +- `rostopic hz /topic_name`: Display the publishing frequency of a topic. +- `rostopic bw /topic_name`: Display the bandwidth usage of a topic. + +## 2.4.3 rosmsg +`rosmsg` is a command-line tool for displaying information about ROS message types. 
+ +- `rosmsg show`: Display the description of a message +- `rosmsg info`: Display detailed information about a message +- `rosmsg list`: List all message types +- `rosmsg md5`: Display the MD5 checksum of a message +- `rosmsg package`: List all messages in a package +- `rosmsg packages`: List all packages that contain messages + +Examples: + +- `rosmsg list`: List all message types in the current ROS environment. +- `rosmsg packages`: List all packages containing message types. +- `rosmsg package `: List all messages in a specific package. +- `rosmsg show `: Display the description of a specific message. +- `rosmsg info `: Similar to `rosmsg show`, it provides information about a message type. +- `rosmsg md5 `: Generate the MD5 checksum of a message for data integrity checks. + +## 2.4.4 rosservice +`rosservice` includes command-line tools to list and query ROS services. + +- `rosservice args`: Print the arguments required by a service +- `rosservice call`: Call a service with the provided arguments +- `rosservice find`: Find services by type +- `rosservice info`: Print information about a service +- `rosservice list`: List all active services +- `rosservice type`: Print the type of a service +- `rosservice uri`: Print the ROSRPC URI of a service + +Examples: + +- `rosservice list`: List all active services. +- `rosservice args /service_name`: Print the arguments required by a specific service. +- `rosservice call /service_name "args"`: Call a service with the provided arguments. +- `rosservice find `: Find services by their message type. +- `rosservice info /service_name`: Get detailed information about a service. +- `rosservice type /service_name`: Get the type of a service. +- `rosservice uri /service_name`: Get the URI of a service. + +## 2.4.5 rossrv +`rossrv` is a command-line tool for displaying information about ROS service types. It is very similar to `rosmsg` in syntax. 
+ +- `rossrv show`: Display the description of a service message +- `rossrv info`: Display detailed information about a service message +- `rossrv list`: List all service message types +- `rossrv md5`: Display the MD5 checksum of a service message +- `rossrv package`: List all service messages in a package +- `rossrv packages`: List all packages that contain service messages + +Examples: + +- `rossrv list`: List all service message types in the current ROS environment. +- `rossrv packages`: List all packages containing service messages. +- `rossrv package `: List all service messages in a specific package. +- `rossrv show `: Display the description of a specific service message. +- `rossrv info `: Similar to `rossrv show`, it provides information about a service message type. +- `rossrv md5 `: Generate the MD5 checksum of a service message for data integrity checks. + +## 2.4.6 rosparam +`rosparam` includes command-line tools for getting and setting ROS parameters on the parameter server, using YAML-encoded files. + +- `rosparam set`: Set a parameter +- `rosparam get`: Get a parameter +- `rosparam load`: Load parameters from an external file +- `rosparam dump`: Dump parameters to an external file +- `rosparam delete`: Delete a parameter +- `rosparam list`: List all parameters + +Examples: + +- `rosparam list`: List all parameters on the parameter server. +- `rosparam set `: Set a parameter with a specific value. +- `rosparam get `: Get the value of a specific parameter. +- `rosparam delete `: Delete a specific parameter. +- `rosparam load `: Load parameters from a YAML file. +- `rosparam dump `: Dump the current parameters to a YAML file. + +These commands allow you to interact with different components of the ROS ecosystem dynamically, providing a powerful way to manage and monitor your ROS-based robot system. 
\ No newline at end of file diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.6-ROS Operation Management/README.md b/6-Robotics/6.1-Introduction to ROS/6.1.6-ROS Operation Management/README.md new file mode 100644 index 0000000..8c3d4a6 --- /dev/null +++ b/6-Robotics/6.1-Introduction to ROS/6.1.6-ROS Operation Management/README.md @@ -0,0 +1,868 @@ +# 6.1.6-ROS Operation Management + +## Managing ROS Nodes with Launch Files + +Launch files in ROS are XML-formatted files used to start and manage multiple ROS nodes efficiently. This section covers the various tags available in launch files, including their attributes and use cases. + +### The `<launch>` Tag + +The `<launch>` tag is the root of every launch file and acts as a container for all other tags. + +#### 1. Attributes +- **deprecated="deprecation statement"** + Indicates to the user that the current launch file has been deprecated. + +#### 2. Child Tags +- All other tags in a launch file are child elements of the `<launch>` tag. + +#### Example: +```xml + + + +``` + +### The `<node>` Tag + +The `<node>` tag is used to specify a ROS node to be launched. It's one of the most commonly used tags in a launch file. Note that the `roslaunch` command does not guarantee that nodes will start in the order they are declared, as the node startup process is multi-threaded. + +#### 1. Attributes +- **pkg="package_name"** + Specifies the package to which the node belongs. + +- **type="nodeType"** + The type of the node, which corresponds to the executable file name. + +- **name="nodeName"** + The name of the node within the ROS network topology. + +- **args="xxx xxx xxx"** (optional) + Passes arguments to the node. + +- **machine="machine_name"** + Specifies the machine on which the node should be launched. + +- **respawn="true | false"** (optional) + Determines whether the node should automatically restart if it exits. + +- **respawn_delay="N"** (optional) + If `respawn` is set to true, this defines a delay of N seconds before the node is restarted. 
+ +- **required="true | false"** (optional) + Indicates whether this node is critical. If set to true, the entire `roslaunch` process will be terminated if the node exits. + +- **ns="namespace"** (optional) + Launches the node within the specified namespace. + +- **clear_params="true | false"** (optional) + Clears all parameters in the node's private namespace before the node starts. + +- **output="log | screen"** (optional) + Determines where the log output should be sent: either to a log file or the screen. The default is `log`. + +#### 2. Child Tags +- **env**: Used for setting environment variables. +- **remap**: Used for remapping topic or service names. +- **rosparam**: Used for setting parameters. +- **param**: Used for setting parameters. + +#### Example: +```xml + + + + + + +``` + +### The `<include>` Tag + +The `<include>` tag is used to include another XML-formatted launch file into the current launch file. This allows for modular and reusable configurations. + +#### 1. Attributes +- **file="$(find package_name)/path/to/file.launch"** + Specifies the path to the launch file to be included. + +- **ns="namespace"** (optional) + Includes the file under the specified namespace. + +#### 2. Child Tags +- **env**: Used for setting environment variables. +- **arg**: Used to pass arguments to the included launch file. + +#### Example: +```xml + + + +``` + +### The `<remap>` Tag + +The `<remap>` tag is used to remap ROS topic or service names. This is useful for avoiding name conflicts or standardizing names across different nodes. + +#### 1. Attributes +- **from="xxx"** + The original topic or service name. + +- **to="yyy"** + The new name for the topic or service. + +#### 2. Child Tags +- None + +#### Example: +```xml + + + + + +``` + +### The `<param>` Tag + +The `<param>` tag is used to set parameters on the ROS parameter server. The source of the parameter can be specified directly in the tag or loaded from an external file. 
When used inside a `<node>` tag, the parameter is set within the node's private namespace. + +#### 1. Attributes +- **name="namespace/parameter_name"** + The name of the parameter, which can include a namespace. + +- **value="xxx"** (optional) + Defines the value of the parameter. If omitted, an external file must be specified as the parameter source. + +- **type="str | int | double | bool | yaml"** (optional) + Specifies the type of the parameter. If not specified, `roslaunch` will attempt to infer the type based on the value: + - Numbers with a `.` are parsed as floating-point (double). + - Numbers without a `.` are parsed as integers. + - The strings "true" and "false" are parsed as boolean values (case-insensitive). + - Everything else is parsed as a string. + +#### 2. Child Tags +- None + +#### Example: +```xml + + + + + +``` + +### The `<rosparam>` Tag + +The `<rosparam>` tag allows parameters to be loaded from a YAML file, exported to a YAML file, or deleted. When used inside a `<node>` tag, the parameters are considered private. + +#### 1. Attributes +- **command="load | dump | delete"** (optional, default is `load`) + Specifies the operation to perform: load parameters from a file, export them to a file, or delete them. + +- **file="$(find package_name)/path/to/file.yaml"** + Specifies the YAML file to load parameters from or export them to. + +- **param="parameter_name"** + The name of the parameter. + +- **ns="namespace"** (optional) + Specifies the namespace for the parameters. + +#### 2. Child Tags +- None + +#### Example: +```xml + + + +``` + +### The `<group>` Tag + +The `<group>` tag is used to group nodes and other tags, and it allows for applying a namespace or other settings to the group as a whole. + +#### 1. Attributes +- **ns="namespace"** (optional) + Applies a namespace to all nodes and parameters within the group. + +- **clear_params="true | false"** (optional) + Clears all parameters in the group's namespace before the group is launched. Use with caution as this can remove critical parameters. + +#### 2. 
Child Tags +- Any tags except the `<launch>` tag can be children of `<group>`. + +#### Example: +```xml + + + + + + +``` + +### The `<arg>` Tag + +The `<arg>` tag is used to define dynamic arguments that can be passed to the launch file at runtime, similar to function parameters. This increases the flexibility of launch files. + +#### 1. Attributes +- **name="argument_name"** + The name of the argument. + +- **default="default_value"** (optional) + Specifies the default value for the argument. + +- **value="value"** (optional) + Specifies the value for the argument. Cannot be used simultaneously with `default`. + +- **doc="description"** + Provides a description of the argument. + +#### 2. Child Tags +- None + +#### 3. Example +Launch file with argument syntax, `hello.launch`: + +```xml + + + + +``` + +Command-line invocation with argument passing: + +```bash +roslaunch hello.launch robot_name:=robot_value +``` +## ROS Workspace Overlay + +Imagine you have two custom workspaces, Workspace A and Workspace B, both containing a package named `turtlesim`. Additionally, the system's built-in workspace also has a package named `turtlesim`. When you invoke the `turtlesim` package, which one will be used? + +### Implementation Steps + +#### Step 0: Create Workspaces A and B +First, create two separate workspaces, A and B. Within each workspace, create a package named `turtlesim`. + +#### Step 1: Modify the `~/.bashrc` File +Add the following lines to your `~/.bashrc` file to source the setup files for both workspaces: + +```bash +source /home/user/path/to/workspaceA/devel/setup.bash +source /home/user/path/to/workspaceB/devel/setup.bash +``` + +Replace `/home/user/path/to/` with the actual paths to your workspaces. 
+ +#### Step 2: Load Environment Variables +Open a new terminal and run the following command to load the updated environment variables: + +```bash +source ~/.bashrc +``` + +#### Step 3: Check ROS Environment Variables +To verify the ROS package paths, run: + +```bash +echo $ROS_PACKAGE_PATH +``` + +**Result:** The output will show the paths in the following order: Workspace B → Workspace A → System Built-in Workspace. + +#### Step 4: Invoke the `turtlesim` Package +Now, run the following command to navigate to the `turtlesim` package: + +```bash +roscd turtlesim +``` + +**Result:** You will be directed to the `turtlesim` package within Workspace B. + + +## Handling ROS Node Name Conflicts + +### Scenario +In ROS, each node has a name, which is defined during node initialization. In C++, this is done using the `ros::init(argc, argv, "node_name");` API, while in Python, it's done with `rospy.init_node("node_name")`. In a ROS network topology, nodes must have unique names because if multiple nodes share the same name, it can cause confusion during invocation. Specifically, if a node with a duplicate name is started, the existing node with that name will be shut down automatically. But what if you need to run multiple instances of the same node or deal with name conflicts? + +ROS provides two strategies to handle such situations: **namespaces** and **name remapping**. + +- **Namespaces** add a prefix to node names. +- **Name remapping** assigns an alias to a node name. + +Both strategies can resolve node name conflicts, and they can be implemented in several ways: + +1. Using the `rosrun` command. +2. Through launch files. +3. In the node's code. + +This section will demonstrate how to use these three methods to avoid node name conflicts. + +### Example Scenario +Let's start two `turtlesim_node` nodes. If you open two terminals and start the nodes directly without any changes, the first node will be shut down when you start the second one. 
You'll see a warning message: + +```plaintext +[ WARN] [1578812836.351049332]: Shutdown request received. +[ WARN] [1578812836.351207362]: Reason given for shutdown: [new node registered with same name] +``` + +Since nodes cannot share the same name, we'll explore several strategies to address this issue. + +## Using `rosrun` for Namespaces and Remapping + +### 1. Setting a Namespace with `rosrun` + +You can set a namespace for a node using the following syntax: + +```bash +rosrun package_name node_name __ns:=/new_namespace +``` + +#### Example: +```bash +rosrun turtlesim turtlesim_node __ns:=/xxx +rosrun turtlesim turtlesim_node __ns:=/yyy +``` + +With these commands, both nodes will run without issues. + +#### Results: +Use `rosnode list` to check the nodes: + +```plaintext +/xxx/turtlesim +/yyy/turtlesim +``` + +### 2. Remapping Node Names with `rosrun` + +You can also remap a node's name, effectively giving it an alias, using the following syntax: + +```bash +rosrun package_name node_name __name:=new_name +``` + +#### Example: +```bash +rosrun turtlesim turtlesim_node __name:=t1 +rosrun turtlesim turtlesim_node __name:=t2 +``` + +With these commands, both nodes will run with their new names. + +#### Results: +Use `rosnode list` to check the nodes: + +```plaintext +/t1 +/t2 +``` + +### 3. 
Combining Namespace and Name Remapping with `rosrun` + +You can combine both techniques, setting a namespace and remapping the node name simultaneously: + +```bash +rosrun package_name node_name __ns:=/new_namespace __name:=new_name +``` + +#### Example: +```bash +rosrun turtlesim turtlesim_node __ns:=/xxx __name:=tn +``` + +#### Results: +Use `rosnode list` to check the node: + +```plaintext +/xxx/tn +``` + +Alternatively, you can set the namespace using an environment variable before starting the node: + +```bash +export ROS_NAMESPACE=xxxx +``` + +## Using Launch Files for Namespaces and Remapping + +In launch files, the `<node>` tag includes two important attributes: `name` and `ns`. These are used for name remapping and setting namespaces, respectively. Using a launch file to handle namespaces and name remapping is straightforward. + +### 1. Launch File Example + +Here's how you can set namespaces and name remapping in a launch file: + +```xml + + + + + +``` + +In this example, the `name` attribute is mandatory, while `ns` is optional. + +### 2. Running the Launch File + +Run the launch file and then use `rosnode list` to see the results: + +```plaintext +/t1 +/t2 +/hello/t1 +``` + +## Setting Namespaces and Remapping in Code + +If you're implementing custom nodes, you have more flexibility in setting namespaces and name remapping directly in your code. + +### 1. C++ Implementation: Name Remapping + +You can set a name alias using the following code: + +```cpp +ros::init(argc, argv, "zhangsan", ros::init_options::AnonymousName); +``` + +#### Execution: +This will append a timestamp to the node's name, ensuring it's unique. + +### 2. C++ Implementation: Setting a Namespace + +You can set a namespace directly in the code like this: + +```cpp +// Remappings are passed as a string-to-string map; "__ns" sets the namespace +std::map<std::string, std::string> map; +map["__ns"] = "xxxx"; +ros::init(map, "wangqiang"); +``` + +#### Execution: +This sets a namespace for the node, allowing it to run without conflicts. + +### 3. 
Python Implementation: Name Remapping + +In Python, you can achieve similar functionality by using the following code: + +```python +rospy.init_node("lisi", anonymous=True) +``` +--- +## Topic Name Remapping in ROS + +In ROS, topic name remapping allows you to change the name of a topic that a node subscribes to or publishes to without modifying the node's code. This is particularly useful when integrating multiple nodes that need to communicate over different topic names. There are three primary methods to remap topic names in ROS: + +1. Using the `rosrun` command. +2. Through launch files. +3. By directly modifying the code in C++ or Python. + +### Using `rosrun` to Remap Topics + +The syntax for remapping a topic name with `rosrun` is: + +```bash +rosrun package_name node_name old_topic_name:=new_topic_name +``` + +### Example: Integrating `teleop_twist_keyboard` with `turtlesim` + +There are two ways to set up communication between the `teleop_twist_keyboard` node and the `turtlesim` display node: + +#### 1. Solution 1: Remap `teleop_twist_keyboard` Topic + +In this approach, we remap the `teleop_twist_keyboard` node's topic to `/turtle1/cmd_vel`. + +- **Start the keyboard control node:** + + ```bash + rosrun teleop_twist_keyboard teleop_twist_keyboard.py /cmd_vel:=/turtle1/cmd_vel + ``` + +- **Start the turtlesim display node:** + + ```bash + rosrun turtlesim turtlesim_node + ``` + +Both nodes will communicate correctly using the `/turtle1/cmd_vel` topic. + +#### 2. Solution 2: Remap `turtlesim` Topic + +Alternatively, we can remap the `turtlesim` node's topic to `/cmd_vel`. + +- **Start the keyboard control node:** + + ```bash + rosrun teleop_twist_keyboard teleop_twist_keyboard.py + ``` + +- **Start the turtlesim display node:** + + ```bash + rosrun turtlesim turtlesim_node /turtle1/cmd_vel:=/cmd_vel + ``` + +Both nodes will communicate correctly using the `/cmd_vel` topic. 
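The `old_topic_name:=new_topic_name` arguments used above can be pictured as a lookup table that a node consults before it connects to a topic. The following is a minimal sketch in plain Python, not how ROS implements remapping internally; `parse_remaps` and `apply_remap` are hypothetical helper names used only for illustration, and the sketch ignores namespace resolution.

```python
def parse_remaps(args):
    """Collect `old:=new` pairs from a command-line argument list."""
    remaps = {}
    for arg in args:
        if ":=" in arg:
            old, new = arg.split(":=", 1)
            # Names starting with "_" are special keys (`__ns`, `__name`)
            # or private parameters (`_A:=100`), not topic remappings.
            if not old.startswith("_"):
                remaps[old] = new
    return remaps


def apply_remap(topic, remaps):
    """Return the remapped topic name, or the original if no rule matches."""
    return remaps.get(topic, topic)


if __name__ == "__main__":
    remaps = parse_remaps(["/cmd_vel:=/turtle1/cmd_vel", "__ns:=/xxx"])
    print(apply_remap("/cmd_vel", remaps))
    print(apply_remap("/other", remaps))
```

In Solution 1 the table belongs to the keyboard node, in Solution 2 to the turtlesim node; either way both nodes end up using the same topic name.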
+ +### Using Launch Files to Remap Topics + +You can also remap topics in a launch file. The syntax for remapping a topic in a launch file is: + +```xml +<node pkg="package_name" type="node_type" name="node_name"> + <remap from="old_topic_name" to="new_topic_name" /> +</node> +``` + +### Example: Integrating `teleop_twist_keyboard` with `turtlesim` Using Launch Files + +Again, there are two solutions: + +#### 1. Solution 1: Remap `teleop_twist_keyboard` Topic + +In this approach, we remap the `teleop_twist_keyboard` node's topic to `/turtle1/cmd_vel`. + +```xml +<launch> + <node pkg="turtlesim" type="turtlesim_node" name="turtlesim" /> + <node pkg="teleop_twist_keyboard" type="teleop_twist_keyboard.py" name="teleop"> + <remap from="/cmd_vel" to="/turtle1/cmd_vel" /> + </node> +</launch> +``` + +Both nodes will communicate correctly using the `/turtle1/cmd_vel` topic. + +#### 2. Solution 2: Remap `turtlesim` Topic + +In this approach, we remap the `turtlesim` node's topic to `/cmd_vel`. + +```xml +<launch> + <node pkg="teleop_twist_keyboard" type="teleop_twist_keyboard.py" name="teleop" /> + <node pkg="turtlesim" type="turtlesim_node" name="turtlesim"> + <remap from="/turtle1/cmd_vel" to="/cmd_vel" /> + </node> +</launch> +``` + +Both nodes will communicate correctly using the `/cmd_vel` topic. + +### Remapping Topics in Code + +The resolved name of a topic depends on the node's namespace, the node's name, and the topic's own name. Topic names fall into three types: + +1. **Global:** The topic name is absolute and starts with a `/`, making it independent of the node's namespace. +2. **Relative:** The topic name is relative and does not start with `/`, meaning it is interpreted within the node's namespace. +3. **Private:** The topic name is private and starts with `~`, meaning it is resolved relative to the node's private namespace. + +Let's explore these concepts through examples in C++ and Python. + +### 1. C++ Implementation + +#### Example Preparation: + +1. **Initialize the node with a name:** + + ```cpp + ros::init(argc, argv, "hello"); + ``` + +2. **Set different types of topic names.** +3. **Pass a `__ns:=xxx` argument when launching the node.** +4. **After the node starts, use `rostopic` to check the topic information.** + +#### Global Topic Name + +Global topic names start with a `/` and are independent of the node's name or namespace. 
+ +- **Example 1:** + + ```cpp + ros::Publisher pub = nh.advertise<std_msgs::String>("/chatter", 1000); + ``` + + **Result:** `/chatter` + +- **Example 2:** + + ```cpp + ros::Publisher pub = nh.advertise<std_msgs::String>("/chatter/money", 1000); + ``` + + **Result:** `/chatter/money` + +#### Relative Topic Name + +Relative topic names do not start with a `/` and are resolved relative to the node's namespace. + +- **Example 1:** + + ```cpp + ros::Publisher pub = nh.advertise<std_msgs::String>("chatter", 1000); + ``` + + **Result:** `/xxx/chatter` + +- **Example 2:** + + ```cpp + ros::Publisher pub = nh.advertise<std_msgs::String>("chatter/money", 1000); + ``` + + **Result:** `/xxx/chatter/money` + +#### Private Topic Name + +Private topic names are resolved relative to the node's private namespace (namespace plus node name). In C++, this is done by creating the `NodeHandle` with `~`. + +- **Example 1:** + + ```cpp + ros::NodeHandle nh("~"); + ros::Publisher pub = nh.advertise<std_msgs::String>("chatter", 1000); + ``` + + **Result:** `/xxx/hello/chatter` + +- **Example 2:** + + ```cpp + ros::NodeHandle nh("~"); + ros::Publisher pub = nh.advertise<std_msgs::String>("chatter/money", 1000); + ``` + + **Result:** `/xxx/hello/chatter/money` + +- **Special Case:** When using `~`, if the topic name starts with `/`, the topic name is treated as absolute. + + ```cpp + ros::NodeHandle nh("~"); + ros::Publisher pub = nh.advertise<std_msgs::String>("/chatter/money", 1000); + ``` + + **Result:** `/chatter/money` + +### 2. Python Implementation + +#### Example Preparation: + +1. **Initialize the node with a name:** + + ```python + rospy.init_node("hello") + ``` + +2. **Set different types of topic names.** +3. **Pass a `__ns:=xxx` argument when launching the node.** +4. **After the node starts, use `rostopic` to check the topic information.** + +#### Global Topic Name + +Global topic names start with a `/` and are independent of the node's name or namespace. 
+ +- **Example 1:** + + ```python + pub = rospy.Publisher("/chatter", String, queue_size=1000) + ``` + + **Result:** `/chatter` + +- **Example 2:** + + ```python + pub = rospy.Publisher("/chatter/money", String, queue_size=1000) + ``` + + **Result:** `/chatter/money` + +#### Relative Topic Name + +Relative topic names do not start with a `/` and are resolved relative to the node's namespace. + +- **Example 1:** + + ```python + pub = rospy.Publisher("chatter", String, queue_size=1000) + ``` + + **Result:** `/xxx/chatter` + +- **Example 2:** + + ```python + pub = rospy.Publisher("chatter/money", String, queue_size=1000) + ``` + + **Result:** `/xxx/chatter/money` + +#### Private Topic Name + +Private topic names start with `~` and are resolved relative to the node's private namespace. + +- **Example 1:** + + ```python + pub = rospy.Publisher("~chatter", String, queue_size=1000) + ``` + + **Result:** `/xxx/hello/chatter` + +- **Example 2:** + + ```python + pub = rospy.Publisher("~chatter/money", String, queue_size=1000) + ``` + + **Result:** `/xxx/hello/chatter/money` + +--- +## Setting Parameters in ROS + +In ROS, parameters are used to configure nodes at runtime. They can be set in various ways: using the `rosrun` command, within launch files, or directly in the code. Parameters can be global, relative, or private, depending on how they are defined. + +### Setting Parameters with `rosrun` + +You can set parameters when launching a node with the `rosrun` command. The syntax for setting parameters is: + +```bash +rosrun package_name node_name _parameter_name:=parameter_value +``` + +#### Example: Setting a Parameter for the Turtlesim Node + +Let's start the `turtlesim_node` and set a parameter `A = 100`. 
+ +```bash +rosrun turtlesim turtlesim_node _A:=100 +``` + +#### Check the Parameters + +You can use the following command to list all parameters and check the results: + +```bash +rosparam list +``` + +**Output:** + +```plaintext +/turtlesim/A +/turtlesim/background_b +/turtlesim/background_g +/turtlesim/background_r +``` + +**Explanation:** The parameter `A` is prefixed with the node name (`/turtlesim/`), which shows that a parameter set with `rosrun` is private to the node. + +## Setting Parameters in Launch Files + +As previously discussed, parameters can be set in launch files using either the `<param>` or `<rosparam>` tags. Parameters set outside the `<node>` tag are global, while those set within the `<node>` tag are private to that node. + +### Example: Setting Parameters with the `<param>` Tag + +Here's an example where we set a global parameter and a private parameter: + +```xml +<launch> + <param name="p1" value="100" /> + <node pkg="turtlesim" type="turtlesim_node" name="t1"> + <param name="p2" value="100" /> + </node> +</launch> +``` + +### Check the Parameters + +After running the launch file, you can check the parameters with: + +```bash +rosparam list +``` + +**Output:** + +```plaintext +/p1 +/t1/p2 +``` + +**Explanation:** The parameter `p1` is global, while `p2` is private to the `t1` node, as indicated by the namespace. + +## Setting Parameters in Code + +Setting parameters in code provides greater flexibility, allowing you to define global, relative, and private parameters programmatically. + +### 1. C++ Implementation + +In C++, parameters can be set using the `ros::param` API or through a `ros::NodeHandle` object. + +#### 1.1 Using `ros::param` to Set Parameters + +The `ros::param::set` function is used to set parameters. The function's first argument is the parameter name, and the second is the parameter value. If the parameter name starts with `/`, it's a global parameter. If it starts with `~`, it's a private parameter. Otherwise, it's a relative parameter. 
+ +**Example:** + +```cpp +ros::param::set("/set_A", 100); // Global, independent of namespace and node name +ros::param::set("set_B", 100); // Relative, dependent on namespace +ros::param::set("~set_C", 100); // Private, dependent on namespace and node name +``` + +Assuming the namespace is `xxx` and the node name is `yyy`, checking the parameters with `rosparam list` would show: + +```plaintext +/set_A +/xxx/set_B +/xxx/yyy/set_C +``` + +#### 1.2 Using `ros::NodeHandle` to Set Parameters + +To set parameters using `ros::NodeHandle`, first create a `NodeHandle` object, then call the `setParam` method. If the parameter name starts with `/`, it's global. If it doesn't start with `/`, whether it's relative or private depends on how the `NodeHandle` object was created. + +**Example:** + +```cpp +ros::NodeHandle nh; +nh.setParam("/nh_A", 100); // Global, independent of namespace and node name + +nh.setParam("nh_B", 100); // Relative, dependent on namespace + +ros::NodeHandle nh_private("~"); +nh_private.setParam("nh_C", 100); // Private, dependent on namespace and node name +``` + +Assuming the namespace is `xxx` and the node name is `yyy`, checking the parameters with `rosparam list` would show: + +```plaintext +/nh_A +/xxx/nh_B +/xxx/yyy/nh_C +``` + +### 2. Python Implementation + +In Python, setting parameters is slightly simpler than in C++. The `rospy.set_param` function is used to set parameters. The first argument is the parameter name, and the second is the parameter value. As with C++, if the parameter name starts with `/`, it's global. If it starts with `~`, it's private. Otherwise, it's relative. 
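The global, relative, and private rules described here (the same ones that apply to topic names) can be summarized in one small function. This is a sketch in plain Python that only mirrors the documented behavior; `resolve_name` is a hypothetical helper, not part of `rospy`, and it does not cover every corner of ROS name resolution.

```python
def resolve_name(name, namespace, node_name):
    """Resolve a ROS name to its global form under the three rules."""
    stripped = namespace.strip("/")
    ns = "/" + stripped if stripped else ""  # "" means the root namespace
    if name.startswith("/"):
        # Global: used as-is, independent of namespace and node name.
        return name
    if name.startswith("~"):
        # Private: placed under <namespace>/<node_name>.
        return f"{ns}/{node_name}/{name[1:].lstrip('/')}"
    # Relative: placed under the node's namespace.
    return f"{ns}/{name}"


if __name__ == "__main__":
    for n in ("/py_A", "py_B", "~py_C"):
        print(n, "->", resolve_name(n, "xxx", "yyy"))
```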
+ +**Example:** + +```python +rospy.set_param("/py_A", 100) # Global, independent of namespace and node name +rospy.set_param("py_B", 100) # Relative, dependent on namespace +rospy.set_param("~py_C", 100) # Private, dependent on namespace and node name +``` + +Assuming the namespace is `xxx` and the node name is `yyy`, checking the parameters with `rosparam list` would show: + +```plaintext +/py_A +/xxx/py_B +/xxx/yyy/py_C +``` + + + diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.7-Common Components and Features of ROS/README.md b/6-Robotics/6.1-Introduction to ROS/6.1.7-Common Components and Features of ROS/README.md new file mode 100644 index 0000000..9bf4aff --- /dev/null +++ b/6-Robotics/6.1-Introduction to ROS/6.1.7-Common Components and Features of ROS/README.md @@ -0,0 +1,89 @@ +# 6.1.7-Common Components and Features of ROS + +## Distributed Communication in ROS + +ROS (Robot Operating System) is designed as a distributed computing environment. This means that a running ROS system can consist of multiple nodes spread across multiple machines. Depending on the configuration, any node may need to communicate with any other node at any time. + +To facilitate this, ROS has specific networking requirements: + +1. There must be full bidirectional connectivity between all machines on all ports. +2. Each machine must announce itself using a name that can be resolved by all other machines. + +### Implementation Steps + +#### 1. Preparation + +Before configuring ROS for distributed communication, ensure that the different computers are on the same network. Ideally, each computer should be assigned a static IP address. If you are using virtual machines, you need to change the network adapter setting to "Bridged Mode" to allow them to interact on the same network. + +#### 2. Modify Configuration Files + +On each computer, you need to modify the `/etc/hosts` file to include the IP addresses and hostnames of the other machines. 
+ +- **On the Host Machine:** + + Add the IP address and hostname of the slave machine. + + ```plaintext + <slave_ip> <slave_hostname> + ``` + +- **On the Slave Machine:** + + Add the IP address and hostname of the host machine. + + ```plaintext + <host_ip> <host_hostname> + ``` + +After updating the `/etc/hosts` files, use the `ping` command to test whether the machines can reach each other. To look up the values you need: + +- **Check IP Address:** Use `ifconfig` or `ip addr show`. +- **Check Hostname:** Use `hostname`. + +#### 3. Configure the Host Machine's IP Address + +On the host machine, add the following lines to the `~/.bashrc` file: + +```bash +export ROS_MASTER_URI=http://<host_ip>:11311 +export ROS_HOSTNAME=<host_ip> +``` + +`ROS_MASTER_URI` points to the host machine's IP address and tells ROS where the master node (`roscore`) is running. `ROS_HOSTNAME` sets the address under which this machine announces itself to other ROS nodes. + +#### 4. Configure the Slave Machine's IP Address + +On each slave machine (you can have multiple slaves), you also need to modify the `~/.bashrc` file by adding: + +```bash +export ROS_MASTER_URI=http://<host_ip>:11311 +export ROS_HOSTNAME=<slave_ip> +``` + +This configuration points each slave machine to the host's ROS master, enabling them to join the ROS network. + +### Testing the Setup + +#### 1. Start `roscore` on the Host Machine + +On the host machine, start the ROS master node by running: + +```bash +roscore +``` + +This step is required because the ROS master lets nodes find each other; without it, nodes on different machines cannot establish connections. + +#### 2. Test Communication from the Host to the Slave + +- **On the Host Machine:** Start a subscriber node. +- **On the Slave Machine:** Start a publisher node. + +Check if the nodes can communicate as expected. + +#### 3. Test Communication from the Slave to the Host + +- **On the Slave Machine:** Start a subscriber node. +- **On the Host Machine:** Start a publisher node. + +Again, verify that communication between the nodes is functioning correctly. 
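Before starting publisher and subscriber nodes, it can help to confirm that `ROS_MASTER_URI` is well formed and that the master's port accepts connections from the current machine. The script below is a minimal sketch using only the Python standard library; `parse_master_uri` and `master_reachable` are hypothetical helper names, and a successful TCP connection only shows that something is listening on that port, not that the full ROS setup is correct.

```python
import os
import socket
from urllib.parse import urlparse


def parse_master_uri(uri):
    """Split a ROS_MASTER_URI such as http://host:11311 into (host, port)."""
    parsed = urlparse(uri)
    if parsed.scheme != "http" or not parsed.hostname:
        raise ValueError(f"unexpected ROS_MASTER_URI: {uri!r}")
    # 11311 is the default roscore port used throughout this section.
    return parsed.hostname, parsed.port or 11311


def master_reachable(uri, timeout=1.0):
    """Return True if a TCP connection to the master's port succeeds."""
    host, port = parse_master_uri(uri)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    uri = os.environ.get("ROS_MASTER_URI", "http://localhost:11311")
    print(uri, "reachable:", master_reachable(uri))
```

Running it on a slave machine after `roscore` has started on the host gives a quick first check; if it reports `False`, revisit the `/etc/hosts` and `~/.bashrc` steps above.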
diff --git a/6-Robotics/6.1-Introduction to ROS/6.1.8-TF Coordinate Transformation in ROS/README.md b/6-Robotics/6.1-Introduction to ROS/6.1.8-TF Coordinate Transformation in ROS/README.md new file mode 100644 index 0000000..e69de29 diff --git a/6-Robotics/6.3-Development with Physical ROS Robots/6.3.1-Get started with ROS robots/Basic motion control methods.md b/6-Robotics/6.3-Development with Physical ROS Robots/6.3.1-Get started with ROS robots/Basic motion control methods.md new file mode 100644 index 0000000..e9ae8ca --- /dev/null +++ b/6-Robotics/6.3-Development with Physical ROS Robots/6.3.1-Get started with ROS robots/Basic motion control methods.md @@ -0,0 +1,71 @@ + +# Basic motion control methods + +### Introduction + +This section introduces simple motion control of the robot chassis using the app, a PS2 controller, or a keyboard. All three control methods transmit control commands via Bluetooth. + +![]() + +--- + +### Use APP to control robots + +##### Obtain Bluetooth APP + +Android users can obtain the Bluetooth app from the documentation package we provide. For example, for the ROS robot model XXX, the app file is named *WHEELTEC_1.1.5.apk*. + +iPhone users only need to search for WHEELTEC in the App Store to download and use it. + +##### Bluetooth connection to robot + +1. **Power the robot** + + After turning on the robot's power switch, you will see the red indicator light of the Bluetooth module blinking, indicating that the Bluetooth module is not yet connected. + + Imgur + +2. **Open Bluetooth app** + + Open the Bluetooth app installed in *"Obtain Bluetooth APP"*, and you will see an interface like the one shown below: + + Imgur + +3. **Connect Bluetooth module** + + Open the following interface by clicking the button consisting of three horizontal bars in the upper left corner, then search for the Bluetooth device named BT-04A and connect. 
The pairing code is 1234. After the connection is successful, the Bluetooth module indicator light becomes solid. + + Imgur + +--- + +### APP usage instructions + +Introduction to the functions of the APP homepage: + +Imgur + +| Number | Name | Function Description | +| ------ | ---------------------- | -------------------- | +| ① | APP Joystick | Used to control the robot's movement. (For details, refer to Chapter 3) | +| ② | Control Mode | Used to switch the robot's movement control mode. | +| ③ | Angle Information Bar | Used to display the Z-axis speed information of the robot. | +| ④ | Debug Bar | Used to display the real-time information received by the APP and the sent signals. | +| ⑤ | Left and Right Encoder | Used to display the speed information of motors A and B of the robot. The left and right encoders correspond to motors A and B, respectively. | +| ⑥ | Battery Progress Bar | Used to display the robot's battery level in percentage form. For some models using lithium iron phosphate batteries, the voltage and capacity do not have a linear relationship, so 0% to 100% corresponds to a battery voltage of 20 V to 25.2 V or 10 V to 12.6 V. | +| ⑦/⑧ | Speed Control Buttons | Adjust the robot's movement speed. Each click increases the speed by 0.1 m/s; reducing speed works the same way. | + +After connecting the Bluetooth APP to the robot's Bluetooth module, the robot's control mode will not switch immediately. The control mode of the robot is displayed in real time in the lower-left corner of the OLED screen on the robot. + +Once the connection is established, lightly push the control joystick forward on the main page of the APP to switch the robot's control mode to APP control mode. 
After switching to the APP control mode, you will be able to control the robot, view waveforms, and adjust parameters using the APP. + (The above four components are mounted on a frame to form a chassis) + + +--- + **If you want to know about related products, you can click the link below:** + + [ReComputer J1020 v2 nano. ](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html) + + [Ros robot kit. ](https://www.aliexpress.us/item/3256801169020544.html?gatewayAdapt=glo2usa) + + diff --git a/6-Robotics/6.3-Development with Physical ROS Robots/6.3.1-Get started with ROS robots/Overview of the ROS robot system.md b/6-Robotics/6.3-Development with Physical ROS Robots/6.3.1-Get started with ROS robots/Overview of the ROS robot system.md new file mode 100644 index 0000000..5c5ef36 --- /dev/null +++ b/6-Robotics/6.3-Development with Physical ROS Robots/6.3.1-Get started with ROS robots/Overview of the ROS robot system.md @@ -0,0 +1,79 @@ +# Overview of the ROS robot system + +### Introduction + +This section will introduce the system architecture of the ROS robot and provide a list of the hardware components. The ROS robot system is primarily composed of two main parts: + +1. The upper-level component of perception and decision. +2. The lower-level component of motion control. + +--- + +### The upper-level component of perception and decision + +This part requires high computational power but has low real-time requirements. It is typically run on a general-purpose operating system on high-performance devices. Programs are written to perform perception and decision-making functions based on input data from various types of sensors. + +**The programs for the perception and decision-making functions of the ROS robot are executed by** [reComputer J1020 v2 nano. 
](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html) + +![](https://media-cdn.seeedstudio.com/media/catalog/product/cache/bb49d3ec4ee05b6f018e93f896b8a25d/1/3/13_1.jpg) +The role of the reComputer is to gather data from various sensors, process and analyze the data as needed, and then decide and control the robot's actions, such as movement and grasping. + +For example, to enable the robot to follow a red object, the camera sensor first captures an image of the environment. The reComputer processes the image to identify the position of the red object and then directs the robot to approach the object. + +The reComputer can be considered a computer capable of running ROS. Since it needs to be installed inside the robot, the computer must be relatively compact. + +**The input data for the perception and decision-making programs are provided by various types of sensors, including:** + +* **Motor encoders** + + ![Imgur](https://i.imgur.com/ZTFAdos.png) + (Obtain the robot's motion information) +* **IMU and GNSS modules** + + ![Imgur](https://i.imgur.com/PxTweDJ.png) + (Obtain external motion information) +* **Single-line LiDAR** + +![Imgur](https://i.imgur.com/MvzQCpB.png) + +(Obtain a two-dimensional laser point cloud) + +* **Multi-line LiDAR** + ![](https://www.livoxtech.com/dps/2d9e037e6d457ef7ffec037f7d16dcf8.png) + (Obtain a three-dimensional laser point cloud) + +* **Depth cameras** + ![Imgur](https://i.imgur.com/Dtz1Ins.png) + (Obtain images with depth information) + +--- + +### The lower-level component of motion control + +This part has low computational power requirements but high real-time requirements. The programs for motion control are written in C and run on a microcontroller. Their primary role is to execute the motion control commands issued by the upper-level system and provide feedback on the robot's current state. 
+ +**The list of hardware used for motion control is as follows:** + +* **Power Battery** + ![Imgur](https://i.imgur.com/YIfOwfY.png) + + (Provide energy for ROS robots) +* **Controller and Driver** + ![Imgur](https://i.imgur.com/38YXWXR.png) + (The controller generates control signals, and the driver amplifies these signals to drive the motor. They can be designed as an integrated unit) +* **Motor and Servo** + ![Imgur](https://i.imgur.com/3oiYCap.png) + (Devices that convert electrical energy into kinetic energy) +* **Chassis** + + ![Imgur](https://i.imgur.com/3NgbrXq.png) + + (The above four components are mounted on a frame to form a chassis) +--- + **If you want to know about related products, you can click the link below:** + + [ReComputer J1020 v2 nano. ](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html) + + [Ros robot kit. ](https://www.aliexpress.us/item/3256801169020544.html?gatewayAdapt=glo2usa) + + diff --git a/6-Robotics/6.3-Development with Physical ROS Robots/6.3.2-Common sensor uses in robots/Depth camera sensor.md b/6-Robotics/6.3-Development with Physical ROS Robots/6.3.2-Common sensor uses in robots/Depth camera sensor.md new file mode 100644 index 0000000..63f59c2 --- /dev/null +++ b/6-Robotics/6.3-Development with Physical ROS Robots/6.3.2-Common sensor uses in robots/Depth camera sensor.md @@ -0,0 +1,69 @@ +# Depth camera sensor + +### Introduction + +This chapter briefly introduces the depth camera on the ROS robot and teaches you how to get started quickly. + +Orbbec Gemini 2 is a binocular structured light 3D camera equipped with Orbbec's new MX6600 depth engine chip. It features three depth operating modes, providing high-quality depth data for a variety of application scenarios. With a wide field of view, it offers a depth measurement range from 0.15 to 10 meters, and integrates auxiliary point ranging functionality, enabling zero-blind-spot depth measurement within a maximum range of 10 meters. 
+![Imgur](https://i.imgur.com/qtx6uda.jpg) + +--- + +### Depth camera usage + + +#### Get source code + +```bash +git clone https://github.com/orbbec/OrbbecSDK.git +``` + +Alternatively, you can install from binary packages. Refer to the [installation guidance](doc/tutorial/English/Installation_guidance.md) for more information. + +### Environment setup + +* Linux: + +If you installed via a Debian package, you can skip the installation of the udev rules file. If not, install it using the following commands: + +```bash +cd OrbbecSDK/misc/scripts +sudo chmod +x ./install_udev_rules.sh +sudo ./install_udev_rules.sh +sudo udevadm control --reload && sudo udevadm trigger +``` + + +## Examples + +The sample code is located in the `./examples` directory and can be built using CMake. + +### Build + +```bash +cd OrbbecSDK && mkdir build && cd build && cmake .. && cmake --build . --config Release +``` + +### Run example + +Connect your Orbbec camera to your PC, then run the following commands: + +```bash +cd OrbbecSDK/build/bin # build output dir +./OBMultiStream # OBMultiStream.exe on Windows +``` + +The following image shows the result of running MultiStream on the Gemini 2 device. Results on other devices may differ. + +![Multistream](https://i.imgur.com/3bBEggL.png) + + +--- + **If you want to know about related products, you can click the link below:** + + [ReComputer J1020 v2 nano. ](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html) + + [Ros robot kit. 
](https://www.aliexpress.us/item/3256801169020544.html?gatewayAdapt=glo2usa) + + + diff --git a/6-Robotics/README.MD b/6-Robotics/README.MD new file mode 100644 index 0000000..520a941 --- /dev/null +++ b/6-Robotics/README.MD @@ -0,0 +1,17 @@ + +## 📚 Table of ROS Robotics +| **Chapter** | **Content** | +|:-----------:|:------------------------------------------------:| +| Module 6.1 | **Introduction to ROS** | +| | [Overview of ROS and Environment Setup](./6.1-Introduction%20to%20ROS/6.1.1-Overview%20of%20ROS%20and%20Environment%20Setup/README.md) | +| | [Quick Experience with HelloWorld for ROS](./6.1-Introduction%20to%20ROS/6.1.2-Quick%20Experience%20with%20HelloWorld%20for%20ROS/README.md) | +| | [ROS Architecture](./6.1-Introduction%20to%20ROS/6.1.3-ROS%20Architecture/README.md)| +| | [ROS Communication Mechanism](./6.1-Introduction%20to%20ROS/6.1.4-ROS%20Communication%20Mechanism/README.md) | +| | [Common ROS Commands](./6.1-Introduction%20to%20ROS/6.1.5-Common%20ROS%20Commands/README.md) | +| | [ROS Operation Management](./6.1-Introduction%20to%20ROS/6.1.6-ROS%20Operation%20Management/README.md) | +| | [Common Components and Features of ROS](./6.1-Introduction%20to%20ROS/6.1.7-Common%20Components%20and%20Features%20of%20ROS/README.md) | +| | [TF Coordinate Transformation in ROS](./6.1-Introduction%20to%20ROS/6.1.8-TF%20Coordinate%20Transformation%20in%20ROS/README.md) | +| Module 6.2| **ROS Robot Simulation** | +| Module 6.3| **Development with Physical ROS Robots** | +| Module 6.4| **ROS Project Practice: Advanced Features** | + diff --git a/7-Algorithm-Optimization-and-Deployment/README.md b/7-Algorithm-Optimization-and-Deployment/README.md new file mode 100644 index 0000000..e69de29 diff --git a/8-Practical-Applications-of-the-Jetson-Platform/README.md b/8-Practical-Applications-of-the-Jetson-Platform/README.md new file mode 100644 index 0000000..e69de29 diff --git a/9-Course-Summary-and-Outlook/README.md b/9-Course-Summary-and-Outlook/README.md new file 
mode 100644 index 0000000..e69de29 diff --git a/README.md b/README.md index 319042b..58ff76f 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,74 @@ # reComputer-Jetson-for-Beginners -Beginner's Guide to reComputer Jetson + + + +Welcome to the reComputer Jetson Orin Beginner Guide! Dive deep into the NVIDIA Jetson Orin platform with this comprehensive guide designed to help developers harness Jetson Orin’s powerful AI computing capabilities. By leveraging cutting-edge technology, you will be well-equipped to innovate in AI and robotics. Join us to explore the vast potential of Jetson and set the stage for pioneering developments in the industry! + +

+ + jetson-for-beginners- banner + +

+ +## 🔦 Features + +- **From Beginner to Master**: + - Start with the basics and progress to mastering advanced AI applications. + - Modules cover the Jetson Orin software stack, computer vision, video analytics, robotics, and generative AI. + + +- **Comprehensive Tool Coverage**: + - Master NVIDIA's core technologies: CUDA, JetPack SDK, TensorRT, and DeepStream. + - Utilize popular AI frameworks such as PyTorch and TensorFlow. + +- **Hands-On, Industry-Relevant, Cutting-Edge Projects**: + - Build an end-to-end AI Network Video Recorder (NVR) system in the Computer Vision module. + - Assemble a complete Autonomous Mobile Robot (AMR) in the Robotics module. + - Deploy cutting-edge large language models such as Llama 3, using tools like Ollama, to create your own chatbot. + +- **Step-by-Step Tutorials**: + - Receive clear, incremental instructions that guide you from basic programming to the development of complex AI applications on the Jetson platform. + + +## 📋 Preparation +Before beginning, ensure you have: +1. Basic knowledge of Linux commands. +2. A Jetson device ([Seeed reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html) recommended). + +> Note: While all NVIDIA Jetson Orin-based devices are suitable, ensure your device has at least 8 GB of memory. + +## About reComputer Jetson Orin + +[![Watch the video](https://img.youtube.com/vi/-KAyUHzRxHc/hqdefault.jpg)](https://www.youtube.com/watch?v=-KAyUHzRxHc "Click to watch the video") + +The [reComputer Jetson Orin](https://www.seeedstudio.com/tag/nvidia.html) is a compact yet powerful intelligent edge box that delivers modern AI performance of up to 100 TOPS to the edge. It features an NVIDIA Jetson Orin module, an open-source carrier board, a heatsink, and a power adapter. Key specifications include 4x USB 3.2, HDMI, GbE, M.2 Key E for Wi-Fi, M.2 Key M for SSD, RTC, CAN, and a 40-pin connector. 
Preinstalled with JetPack, reComputer simplifies development and is ideal for edge AI solution providers focusing on video analytics, object detection, natural language processing, medical imaging, and robotics in smart cities, security, and industrial automation. + +## 📚 Table of Contents +Explore a broad range of topics from Jetson platform basics to generative AI deployment: + +| **Chapter** | **Content** | +|:-----------:|:------------------------------------------------:| +| **Module 1**| **Introduction** (incomplete)| +| **Module 2**| [**reComputer Jetson Platform Overview**](./2-reComputer-Jetson-Platform-Overview/README.md)| +| **Module 3**| [**Basic Tools and Getting Started**](./3-Basic-Tools-and-Getting-Started/README.MD)| +| **Module 4**| [**Computer Vision Applications**](./4-Computer-Vision/README.md)| +| **Module 5**| [**Generative AI Applications**](./5-Generative-AI/README.md)| +| **Module 6** | [**ROS Robotics**](./6-Robotics/README.MD)| +| **Module 7**| **Algorithm Optimization and Deployment** (incomplete)| +| **Module 8**| **Practical Applications of the Jetson Platform** (incomplete)| +| **Module 9**| **Course Summary and Outlook** (incomplete)| + + + +## 📜 License +This project is licensed under the [MIT License](https://github.com/Seeed-Projects/reComputer-Jetson-for-Beginners/blob/main/LICENSE). 
+ + + +## 🔗 Reference +- [AI for Beginners - Microsoft](https://github.com/microsoft/AI-For-Beginners) +- [Seeed Projects - Jetson Examples](https://github.com/Seeed-Projects/jetson-examples) +- [jetson-containers](https://github.com/dusty-nv/jetson-containers) +- [jetson-inference](https://github.com/dusty-nv/jetson-inference) diff --git a/Table-of-Contents.md b/Table-of-Contents.md new file mode 100644 index 0000000..4465358 --- /dev/null +++ b/Table-of-Contents.md @@ -0,0 +1,42 @@ + +## 📚 Table of Contents +Explore a broad range of topics from Jetson platform basics to generative AI deployment: + +| **Chapter** | **Content** | +|:-----------:|:------------------------------------------------:| +| **Module 1**| **Introduction** | +| **Module 2**| **reComputer Jetson Platform Overview** | +| **Module 3**| **Basic Tools and Getting Started** | +| Module 3.1 | [Python and Programming Fundamentals](./3-Basic-Tools-and-Getting-Started/3.1-Python-and-Programming-Fundamentals/README.md) | +| Module 3.2 | [AI and ML](./3-Basic-Tools-and-Getting-Started/3.2-AI-and-ML/README.md) | +| Module 3.3 | [PyTorch and TensorFlow](./3-Basic-Tools-and-Getting-Started/3.3-Pytorch-and-Tensorflow/README.md) | +| Module 3.4 | [CUDA](./3-Basic-Tools-and-Getting-Started/3.4-CUDA/README.md) | +| Module 3.5 | [TensorRT](./3-Basic-Tools-and-Getting-Started/3.5-TensorRT/README.md) | +| Module 3.6 | [Docker](./3-Basic-Tools-and-Getting-Started/3.6-Docker/README.md) | +| Module 3.7 | [ROS1/ROS2](./3-Basic-Tools-and-Getting-Started/3.7-ROS/README.md) | +| Module 3.8 | [OpenCV with CUDA](./3-Basic-Tools-and-Getting-Started/3.8-OpenCV-with-CUDA/README.md) | +| **Module 4**| **Computer Vision Applications** | +| Module 4.1| [Overview-of-Computer-Vision](./4-Computer-Vision/4.1-Overview-of-Computer-Vision/README.md)| +| Module 4.2| [Real-time-Video-Processing](./4-Computer-Vision/4.2-Real-time-Video-Processing/README.md)| +| **Module 4.3**| **Object Detection and Recognition**| +| Module 4.3.1| 
[Train and Deploy YOLOv8](./4-Computer-Vision/4.3-Object%20Detection%20and%20Recognition/4.3.1-Train%20and%20Deploy%20YOLOv8%20on%20reComputer/README.md)| +| Module 4.3.2| [Deploy YOLOv8 using TensorRT and DeepStream SDK Support](./4-Computer-Vision/4.3-Object%20Detection%20and%20Recognition/4.3.2-Deploy%20YOLOv8%20on%20NVIDIA%20Jetson%20using%20TensorRT%20and%20DeepStream%20SDK%20Support/README.md)| +| **Module 4.4**| [**Project Practice-Intelligent Surveillance System**](./4-Computer-Vision/4.4-Project%20Practice-Intelligent%20Surveillance%20System/README.md)| +| **Module 5**| **Generative AI Applications** | +| **Module 6** | **ROS Robotics** | +| Module 6.1 | Introduction to ROS | +| Module 6.1.1 | [Overview of ROS and Environment Setup](./6-Robotics/6.1-Introduction%20to%20ROS/6.1.1-Overview%20of%20ROS%20and%20Environment%20Setup/README.md) | +| Module 6.1.2 | [Quick Experience with HelloWorld for ROS](./6-Robotics/6.1-Introduction%20to%20ROS/6.1.2-Quick%20Experience%20with%20HelloWorld%20for%20ROS/README.md) | +| Module 6.1.3 | [ROS Architecture](./6-Robotics/6.1-Introduction%20to%20ROS/6.1.3-ROS%20Architecture/README.md)| +| Module 6.1.4 | [ROS Communication Mechanism](./6-Robotics/6.1-Introduction%20to%20ROS/6.1.4-ROS%20Communication%20Mechanism/README.md) | +| Module 6.1.5 | [Common ROS Commands](./6-Robotics/6.1-Introduction%20to%20ROS/6.1.5-Common%20ROS%20Commands/README.md) | +| Module 6.1.6 | [ROS Operation Management](./6-Robotics/6.1-Introduction%20to%20ROS/6.1.6-ROS%20Operation%20Management/README.md) | +| Module 6.1.7 | [Common Components and Features of ROS](./6-Robotics/6.1-Introduction%20to%20ROS/6.1.7-Common%20Components%20and%20Features%20of%20ROS/README.md) | +| Module 6.1.8 | [TF Coordinate Transformation in ROS](./6-Robotics/6.1-Introduction%20to%20ROS/6.1.8-TF%20Coordinate%20Transformation%20in%20ROS/README.md) | +| Module 6.2| ROS Robot Simulation | +| Module 6.3| Development with Physical ROS Robots | +| Module 6.4| ROS Project Practice: 
Advanced Features | +| **Module 7**| **Algorithm Optimization and Deployment** | +| **Module 8**| **Practical Applications of the Jetson Platform** | +| **Module 9**| **Course Summary and Outlook** | +