From 5bdc0c1f8c57259a2d525377ee084222d4fb362c Mon Sep 17 00:00:00 2001 From: Charliechen114514 <725610365@qq.com> Date: Sat, 13 Jun 2026 10:45:17 +0800 Subject: [PATCH 1/3] feat: sync the english version --- README.md | 2 +- .../01-from-loops-to-iterators.md | 419 +++++++++++++ ...02-stl-algorithms-and-iterator-pitfalls.md | 419 +++++++++++++ .../03-ranges-views-and-composition.md | 437 +++++++++++++ .../2025/03-back-to-basics-ranges/index.md | 39 ++ .../01-copy-cost-and-motivation.md | 453 ++++++++++++++ .../02-lvalue-rvalue-and-references.md | 419 +++++++++++++ .../03-move-ops-stdmove-and-elision.md | 591 ++++++++++++++++++ .../04-back-to-basics-move-semantics/index.md | 43 ++ .../cppcon/2025/index.md | 29 + .../01-from-loops-to-iterators.md | 409 ++++++++++++ ...02-stl-algorithms-and-iterator-pitfalls.md | 408 ++++++++++++ .../03-ranges-views-and-composition.md | 428 +++++++++++++ .../2025/03-back-to-basics-ranges/index.md | 32 + .../01-copy-cost-and-motivation.md | 443 +++++++++++++ .../02-lvalue-rvalue-and-references.md | 409 ++++++++++++ .../03-move-ops-stdmove-and-elision.md | 581 +++++++++++++++++ .../04-back-to-basics-move-semantics/index.md | 34 + .../cppcon/2025/index.md | 29 + 19 files changed, 5623 insertions(+), 1 deletion(-) create mode 100644 documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md create mode 100644 documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/02-stl-algorithms-and-iterator-pitfalls.md create mode 100644 documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/03-ranges-views-and-composition.md create mode 100644 documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/index.md create mode 100644 documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md create mode 100644 
documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/02-lvalue-rvalue-and-references.md create mode 100644 documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/03-move-ops-stdmove-and-elision.md create mode 100644 documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/index.md create mode 100644 documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md create mode 100644 documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/02-stl-algorithms-and-iterator-pitfalls.md create mode 100644 documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/03-ranges-views-and-composition.md create mode 100644 documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/index.md create mode 100644 documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md create mode 100644 documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/02-lvalue-rvalue-and-references.md create mode 100644 documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/03-move-ops-stdmove-and-elision.md create mode 100644 documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/index.md diff --git a/README.md b/README.md index 1df31e2cd..2c2f476ab 100644 --- a/README.md +++ b/README.md @@ -19,7 +19,7 @@ --- -![English Coverage](https://img.shields.io/badge/en_coverage-99%25-green.svg) 420/423 docs translated +![English Coverage](https://img.shields.io/badge/en_coverage-99%25-green.svg) 428/431 docs translated ## 这是什么项目 diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md new file mode 100644 index 000000000..83b033606 --- /dev/null +++ 
b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md
@@ -0,0 +1,419 @@
+---
+title: 'From Loops to Iterators: The Path to Abstracting Data Traversal'
+description: 'CppCon 2025 Talk Notes - Mike Shah: From for Loops and Pointer Traversal
+  to Iterator Abstractions, Completing the Iterator Category Hierarchy, and Benchmarking
+  Legacy Tags vs. C++20 Concepts Using GCC 16.1.1'
+conference: cppcon
+conference_year: 2025
+talk_title: 'Back to Basics: C++ Ranges'
+speaker: Mike Shah
+video_youtube: https://www.youtube.com/watch?v=Q434UHWRzI0
+tags:
+- cpp-modern
+- host
+- beginner
+- Ranges
+- 容器
+difficulty: beginner
+platform: host
+cpp_standard:
+- 11
+- 17
+- 20
+chapter: 3
+order: 1
+translation:
+  source: documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md
+  source_hash: 91550fa1d1d266e526d5c6e4b17b99b311f751048bd984193688b9b984dc07bf
+  translated_at: '2026-06-13T02:12:46.363648+00:00'
+  engine: anthropic
+  token_count: 4006
+---
+# From Loops to Iterators: The Abstraction Path of Data Traversal
+
+:::tip
+This article is an in-depth adaptation of Mike Shah's "Back to Basics: C++ Ranges" from CppCon 2025. The YouTube link is in the front matter above. This series is planned in three parts: this part clarifies the "data traversal" thread (loops → pointers → iterators → range-based for), the second part covers STL algorithms and iterator pitfalls, and the third part officially dives into Ranges, Views, and pipeline composition. The experimental environment is Arch Linux WSL, GCC 16.1.1, with compiler flag `-std=c++20`.
+:::
+
+Mike Shah opened his talk with a very plain statement that I found increasingly reasonable the more I thought about it: **an algorithm is essentially a loop**.
He mentioned reading a 2012 paper on the empirical performance evaluation of algorithms during his graduate studies, which led him to a realization: when you face an unfamiliar codebase and want to figure out "where the computation actually happens," the fastest way is to look for the loops. As engineers, half of our work is **transforming data** and the other half is **storing data**, and loops are the most direct vehicle for transforming data.
+
+:::warning Take Shah's statement with a grain of salt
+"Algorithm = loop" is a "gross oversimplification," as he himself repeatedly emphasized. Just get the gist of it. Strictly speaking, an algorithm is a finite sequence of steps to solve a problem. Recursive algorithms, parallel algorithms (`<execution>`), and coroutine-based algorithms don't necessarily look like `for`. Loops are just one of the most common vehicles. But as an entry point for understanding the STL and Ranges, this simplification works well: **first understand loops, then see how the STL abstracts loops away.**
+:::
+
+In this article, we start from the most primitive index-based loop and see step by step how C++ abstracts "data traversal" layer by layer. Our destination isn't Ranges (that's part three), but **iterators**, the bridge connecting "loops" and "algorithms."
+
+Let's lay out the experimental environment first; all subsequent output is based on it:
+
+```bash
+❯ g++ --version
+g++ (GCC) 16.1.1 20260430
+
+❯ uname -sr
+Linux 6.18.33.1-microsoft-standard-WSL2
+```
+
+## The Most Basic Traversal: Index-Based for Loops
+
+Everything starts here. Suppose we have a string of characters to print one by one.
Most people would subconsciously write the three-part `for`:
+
+```cpp
+#include <array>
+#include <iostream>
+
+int main()
+{
+    std::array<char, 5> message{'H', 'e', 'l', 'l', 'o'};
+
+    for (std::size_t i = 0; i < message.size(); ++i) {
+        std::cout << message[i];
+    }
+    std::cout << '\n';
+}
+```
+
+This code actually hides two implicit assumptions that we use so habitually we never think about them. First, it assumes the container supports index-based access through `operator[]`; second, it assumes the container knows its own `size()`. `std::array`, `std::vector`, and `std::string` all satisfy these two conditions, so it runs fine. But switch to `std::list` or `std::set`, neither of which has index-based access, and this code won't compile. The same "traversal" logic needs to be rewritten for a different container, which is exactly the sign of insufficient abstraction.
+
+But let's not rush to abstract. Whether index-based loops should be used, and when, is a nuanced question, but it's not the focus here. What we care about is: **it expresses "traversal," but it tightly couples traversal with "the container happens to use contiguous storage and happens to support indexing."** We want to extract the former on its own.
+
+## A Different Perspective: Traversal with Pointers
+
+Shah showed an alternative approach on his slides, and I was momentarily surprised: this works too? Instead of using indices, he gets the starting address of the array and walks through it with pointers:
+
+```cpp
+char* begin = message.data();
+char* end = message.data() + message.size();
+for (char* p = begin; p != end; ++p) {
+    std::cout << *p;
+}
+```
+
+Here, `data()` returns the address of the first element of the underlying array, and `end` is the starting address plus the number of elements (plain pointer arithmetic). Then inside the loop, `*p` dereferences and `++p` advances one step.
The output is exactly the same as the index-based version, but the perspective is completely different: **we no longer rely on the "index" abstraction, but directly manipulate "addresses."**
+
+Why switch to this perspective? Shah's motivation is straightforward: **generalization**. Indexing assumes "contiguous storage + random access," but in reality, many data structures aren't contiguous: linked lists, trees, graphs. How do you `tree[i]` a binary tree? You can't index it with an integer. But "starting from some point and stepping to the next element" is the common core of all data structure traversals. Pointer `++` is just the simplest implementation of "stepping to the next."
+
+:::tip A brief note on the origins of STL
+Abstracting "incrementing a pointer" into a replaceable object was the work of Alexander Stepanov and Meng Lee at HP Labs in the 1990s. This was the prototype of the STL, submitted to the committee in 1993–94 and later incorporated into the C++98 standard. Iterators were designed from the very beginning to "decouple algorithms from data structures," not added as an afterthought.
+:::
+
+## Iterators: Generalizing Pointers
+
+Since "stepping to the next element" can have different implementations, we might as well abstract it into a type: the **iterator**. The first sentence about iterators on cppreference is: **"Iterators are a generalization of pointers"**.
+
+We use the `std::begin` and `std::end` free function pair to get the begin and end iterators of a container:
+
+```cpp
+for (auto it = std::begin(message); it != std::end(message); ++it) {
+    std::cout << *it;
+}
+```
+
+See? The code looks almost identical to the pointer version: `begin`, `end`, `!=`, `++`, `*`. The only difference is that the type of `it` is no longer `char*`, but an object that "behaves like a pointer." Switch to `std::list` or `std::set`, and this code runs without changing a single word (as long as their iterators support these operations).
Abstraction starts paying us back here.
+
+There are two details worth pausing on. The first is that `begin()` points to the first element, while `end()` points to **one past the last element** and cannot be dereferenced. This half-open interval `[begin, end)` convention wasn't chosen arbitrarily: **it makes checking for an "empty container" extremely natural**. An empty container is simply `begin == end`, the loop condition is false immediately, and no special case is needed. If `end` pointed to the last element itself, an empty container would have no "last element" to point at, making it awkward to handle.
+
+The second detail is the difference between the **free function** form `std::begin` / `std::end` and the **member function** form `.begin()` / `.end()` on containers.
+
+:::warning Shah wasn't quite accurate here
+In his talk, Shah said "only some containers have `.begin()` and `.end()`, but not all containers do, so free functions are more universal." This statement is **inaccurate**. The fact is: **all STL containers have `.begin()` / `.end()` member functions**, without exception. (Container adaptors such as `std::stack` and `std::queue` expose no iterators at all, but they are adaptors, not containers.)
+
+The true value of the free functions `std::begin` / `std::end` lies in three things: first, they provide overloads for **raw arrays** (like `int arr[5]`), which have no member functions, so free functions are the only way to get begin/end pointers; second, they make **generic code** more uniform (no need to distinguish between "this is a container or an array" in templates); third, C++20's `std::ranges::begin` / `std::ranges::end` additionally support sentinels and proxy-reference ranges (like `vector<bool>`). So a more accurate statement would be: **free functions are more uniform for built-in arrays and custom types, not "some containers lack member functions."**
+:::
+
+## The Iterator Category Hierarchy: Not All Iterators Are Created Equal
+
+At this point, Shah said in his talk, "I won't go into iterator categories," and skipped it.
But this is exactly where beginners stumble the most. Since this article is an in-depth adaptation, let's fill in that gap; this is the **main event** of this article.
+
+Not all iterators have the same capabilities. A `std::vector` iterator can jump five positions at once with `it + 5`, but a `std::list` iterator can't; it can only step with `++` and `--`. The standard divides iterators into several **categories** by capability, from weakest to strongest roughly: input → forward → bidirectional → random access → contiguous (added in C++20).
+
+The key question is: **how do you know which category a given iterator belongs to?** Before C++20, it relied on a type trait called `std::iterator_traits<It>::iterator_category` (a tag type); C++20 added a set of **concepts**, such as `std::random_access_iterator` and `std::contiguous_iterator`. These two systems coexist in C++20, but they can give **different** answers for the same iterator, and behind this lies a very important evolution.
+
+I wrote a small program using GCC 16.1.1 to print both sets of results for common containers:
+
+```cpp
+#include <array>
+#include <cstdio>
+#include <deque>
+#include <forward_list>
+#include <iterator>
+#include <list>
+#include <map>
+#include <set>
+#include <string>
+#include <type_traits>
+#include <vector>
+
+// Legacy C++98 style: read the tag from iterator_traits
+template <typename It>
+const char* legacy_tag()
+{
+    using cat = typename std::iterator_traits<It>::iterator_category;
+    if constexpr (std::is_same_v<cat, std::contiguous_iterator_tag>) return "contiguous";
+    else if constexpr (std::is_same_v<cat, std::random_access_iterator_tag>) return "random_access";
+    else if constexpr (std::is_same_v<cat, std::bidirectional_iterator_tag>) return "bidirectional";
+    else if constexpr (std::is_same_v<cat, std::forward_iterator_tag>) return "forward";
+    else if constexpr (std::is_same_v<cat, std::input_iterator_tag>) return "input";
+    else return "?";
+}
+
+// New C++20 style: probe with concepts, strongest first
+template <typename It>
+const char* cpp20_concept()
+{
+    if constexpr (std::contiguous_iterator<It>) return "contiguous_iterator";
+    else if constexpr (std::random_access_iterator<It>) return "random_access_iterator";
+    else if constexpr (std::bidirectional_iterator<It>) return "bidirectional_iterator";
+    else if constexpr (std::forward_iterator<It>) return "forward_iterator";
+    else if constexpr (std::input_iterator<It>) return "input_iterator";
+    else return "(none)";
+}
+
+template <typename It>
+void row(const char* name)
+{
+    std::printf("%-26s legacy_category=%-15s cpp20_concept=%s\n",
+                name, legacy_tag<It>(), cpp20_concept<It>());
+}
+
+int main()
+{
+    // Element types (int) are assumed here; the category does not depend on them.
+    row<std::array<int, 5>::iterator>("std::array");
+    row<std::vector<int>::iterator>("std::vector");
+    row<std::string::iterator>("std::string");
+    row<std::deque<int>::iterator>("std::deque");
+    row<std::list<int>::iterator>("std::list");
+    row<std::forward_list<int>::iterator>("std::forward_list");
+    row<std::set<int>::iterator>("std::set");
+    row<std::map<int, int>::iterator>("std::map");
+    row<int*>("int* (raw pointer)");
+
+    static_assert(std::contiguous_iterator<int*>);
+    static_assert(std::random_access_iterator<std::deque<int>::iterator>);
+    static_assert(!std::contiguous_iterator<std::deque<int>::iterator>);
+    static_assert(!std::random_access_iterator<std::list<int>::iterator>);
+    std::printf("static_assert checks: PASS\n");
+}
+```
+
+Compile and run:
+
+```bash
+❯ g++ -std=c++20 -O2 -Wall iter.cpp -o iter && ./iter
+std::array                 legacy_category=random_access   cpp20_concept=contiguous_iterator
+std::vector                legacy_category=random_access   cpp20_concept=contiguous_iterator
+std::string                legacy_category=random_access   cpp20_concept=contiguous_iterator
+std::deque                 legacy_category=random_access   cpp20_concept=random_access_iterator
+std::list                  legacy_category=bidirectional   cpp20_concept=bidirectional_iterator
+std::forward_list          legacy_category=forward         cpp20_concept=forward_iterator
+std::set                   legacy_category=bidirectional   cpp20_concept=bidirectional_iterator
+std::map                   legacy_category=bidirectional   cpp20_concept=bidirectional_iterator
+int* (raw pointer)         legacy_category=random_access   cpp20_concept=contiguous_iterator
+static_assert checks: PASS
+```
+
+See the pattern? **The most interesting parts are the first few lines and the last line.** For `std::array`, `std::vector`, `std::string`, and the raw pointer `int*`, the old tags are all `random_access`, but the C++20 concept probe reveals them as `contiguous_iterator`.
+
+This is the problem: **the old tag system simply didn't have a `contiguous` tier** (`contiguous_iterator_tag` was only added in C++20). Before C++20, the `iterator_category` of `int*` could only be tagged as `random_access`, with no way to express the stronger property that "this memory is not only randomly accessible but also physically contiguous." Even in C++20, `iterator_category` still reports `random_access` for these types so that existing code comparing tags exactly keeps working; the contiguous tag is carried by a separate `iterator_concept`, which the concepts consult. Why does this distinction matter? Because "contiguous storage" means you can safely treat the underlying data of the iterator as one block of memory and hand it to a C interface (like `memcpy`, CUDA kernels, or SIMD instructions). `std::deque` also supports `it + 5`, but its internal storage is chunked and **not contiguous**, so its concept is `random_access_iterator` rather than `contiguous_iterator`.
+
+:::tip This is where concepts outshine tags
+The old tags form an inheritance chain (`random_access_iterator_tag` inherits from `bidirectional_iterator_tag`, which inherits from `forward_iterator_tag`, and so on), and all they can do is label a type. C++20 concepts form a refinement chain too (`std::contiguous_iterator` requires `std::random_access_iterator`, and so on down), so "randomly accessible" and "contiguously stored" are not independent properties. What changes is how the check is made: a concept tests the operations a type actually supports and reads the new `iterator_concept` tag where one is provided, which is why the two columns above disagree. Concepts are also what let the Ranges library state its constraints directly in function signatures, and that is a large part of why Ranges entered the standard together with concepts in C++20. For a more systematic explanation of concepts, see the relevant articles in vol4; we'll also use them when we cover Ranges in part three.
+:::
+
+## Iterator Arithmetic and std::advance
+
+With the category concept in mind, iterator arithmetic becomes clear. For random access iterators, you can directly write `it + 5`, `it - 2`, and `it1 - it2` (compute distance), all in O(1). But for bidirectional or forward iterators, `it + 5` simply won't compile: bidirectional iterators only understand `++` and `--`, and forward iterators only `++`.
+
+So if I'm writing generic code and want to "advance n steps" without restricting the iterator category, what do I do?
The standard library provides `std::advance`:
+
+```cpp
+auto it = std::begin(message);
+auto last = std::end(message);
+std::ptrdiff_t available = std::distance(it, last);
+if (5 < available) {
+    std::advance(it, 5);  // safe: more than 5 elements remain, so it stays dereferenceable
+}
+```
+
+The beauty of `std::advance` is that it **automatically selects the implementation** based on the iterator category: pass it a `std::vector` iterator, and it uses `it += n` (O(1)); pass it a `std::list` iterator, and it degrades to n calls of `++` (O(n)). The same calling interface, but different algorithmic complexity behind the scenes. This is the sweet spot of generic programming.
+
+:::warning advance doesn't do bounds checking
+But one thing must be noted: **`std::advance` doesn't check bounds on its own**. If you tell it to advance 100 steps but the container only has five elements, it won't raise an error. It will simply walk past `end()`, which is undefined behavior: the program may segfault, or it may silently read garbage. That's why in the code above, I first used `std::distance` to calculate the remaining length and made a check. In practice, if you want iterators with bounds checking, define the `_GLIBCXX_DEBUG` macro (`-D_GLIBCXX_DEBUG`) when building against libstdc++ with GCC or Clang, which makes standard library iterators carry bounds detection in debug mode. We'll use it to catch a real out-of-bounds bug in the next article. The MSVC equivalent is `_ITERATOR_DEBUG_LEVEL=2`.
+:::
+
+## range-based for: Syntactic Sugar for Loops
+
+After all this talk about iterators, let's return to everyday coding. Most of the time, we don't hand-write `for (auto it = begin; it != end; ++it)`, but instead use the **range-based for loop** from C++11:
+
+```cpp
+for (char c : message) {
+    std::cout << c;
+}
+```
+
+Clean, hard to get wrong, no need to worry about `end`. But what's really behind this syntactic sugar? It's actually an equivalent rewrite of the hand-written iterator loop above.
Per the standard, it's roughly equivalent to:
+
+```cpp
+{
+    auto&& __range = message;
+    auto __begin = std::begin(__range);  // or __range.begin()
+    auto __end = std::end(__range);      // or __range.end()
+    for (; __begin != __end; ++__begin) {
+        char c = *__begin;
+        std::cout << c;                  // your loop body
+    }
+}
+```
+
+This explains a common confusion: **how does range-based for know to call `begin`/`end`?** The answer is that the compiler inserts these two calls for you behind the scenes. It first binds `__range`, then gets the begin and end iterators, and then it's just a normal iterator loop. So range-based for has no additional requirements on iterator categories: as long as your type can provide `begin`/`end` (as member functions, or as free functions found by argument-dependent lookup), it can be used. This is also why, later on, our custom types only need to implement these two functions to work directly with range-based for.
+
+If you're traversing a key-value container like `std::map`, C++17's **structured binding** combined with range-based for is extremely handy:
+
+```cpp
+const std::map<std::string, int> scores{
+    {"alice", 90}, {"bob", 85}
+};
+
+for (const auto& [name, score] : scores) {
+    std::cout << name << ": " << score << '\n';
+}
+```
+
+:::warning Adding a version number for structured binding
+Shah used structured binding in his talk but **didn't mark which standard it belongs to**, so let's fill that in: **structured binding was introduced in C++17 (proposal P0217)**. If your project is still on C++14, this code won't compile.
+
+Also, Shah mentioned that "ellipsis syntax can further unpack," which is a bit vague as stated. In C++17 through C++23, structured binding doesn't support variadic unpacking: the number of names is fixed and must match the number of elements in the right-hand type, and ellipses belong to template parameter pack expansion and fold expressions. C++26 does add the ability for a structured binding to introduce a pack with an ellipsis (proposal P1061), which is most likely what the remark referred to. Either way, it isn't available under the `-std=c++20` used in this series, so don't read too much into it here.
+:::
+
+## Experiment: Do range-based for and Hand-Written Loops Compile to the Same Thing?
+
+Whenever I tell people "range-based for is just syntactic sugar," some are skeptical: do those `__range`, `__begin`, and `__end` temporary variables slow things down? Let's test it. I wrote the same "sum" operation in four different styles:
+
+```cpp
+#include <cstddef>
+#include <vector>
+
+int sum_index(const std::vector<int>& v)
+{
+    int s = 0;
+    for (std::size_t i = 0; i < v.size(); ++i) s += v[i];
+    return s;
+}
+
+int sum_ptr(const std::vector<int>& v)
+{
+    int s = 0;
+    for (const int* p = v.data(), *e = p + v.size(); p != e; ++p) s += *p;
+    return s;
+}
+
+int sum_iter(const std::vector<int>& v)
+{
+    int s = 0;
+    for (auto it = v.begin(), e = v.end(); it != e; ++it) s += *it;
+    return s;
+}
+
+int sum_rangefor(const std::vector<int>& v)
+{
+    int s = 0;
+    for (int x : v) s += x;
+    return s;
+}
+```
+
+Then I turned on `-O2` to have the compiler generate assembly:
+
+```bash
+❯ g++ -std=c++20 -O2 -S codegen.cpp -o codegen.s
+```
+
+If you dig into the `.s` file and look at the hot loops of these four functions, you'll find they all uniformly look like this (using `sum_rangefor` as an example):
+
+```asm
+.L19:
+    addl    (%rax), %edx    ; s += *p
+    addq    $4, %rax        ; p++ (an int is 4 bytes)
+    cmpq    %rcx, %rax      ; p == e ?
+    jne     .L19            ; loop again if not equal
+```
+
+The loop bodies generated by all four styles are **nearly identical at the byte level**: at `-O2`, the compiler reduces all those temporary variables, index calculations, and pointer arithmetic to the same `add / cmp / jne`. In other words, **range-based for has zero additional overhead when optimization is enabled**, so you can confidently use it for readability. The cost only appears at `-O0` (no optimization), where those `__begin`/`__end` temporaries dutifully exist on the stack, but who pursues performance at `-O0` anyway?
+
+:::tip A small restriction lifted in C++17
+By the way, a brief note on the history of range-based for itself: it entered the standard in C++11 (proposal N2930). The C++11 expansion declared `__begin` and `__end` together in a single `auto` declaration, so the two had to have exactly the same type (`end` was already evaluated only once, before the loop started). C++17 (proposal P0184) split them into two separate declarations, as in the expansion shown earlier, so `__end` may now have a different type from `__begin`. That is what makes sentinel-terminated ranges usable with range-based for. So the range-based for you use today is the C++17 revised version, which is more general. This also reminds us: use the newest standard you can, as many "syntactic sugars" have been quietly polished in subsequent versions.
+:::
+
+## A Pair of Iterators Is a Range
+
+At this point, we can draw a complete line for "traversal": **a begin iterator `begin`, plus an end marker `end`, stepping through with `++`**. This pair of iterators defines a traversable piece of data. The standard library calls this "pair of iterators" a **range**.
+
+Why is this concept important? Because it completely decouples "where the data is" from "how to process the data." If I write a sum function that accepts a pair of iterators, it works for `vector`, `list`, `set`, or even a hand-rolled linked list, as long as these containers can provide iterators that meet the requirements. Algorithms are no longer tied to a specific container type.
+
+And the iterator abstraction itself is actually a classic design pattern: the **Iterator pattern**, a behavioral pattern from GoF's *Design Patterns*. Its core idea is "providing a way to access the elements of an aggregate object sequentially without exposing its underlying representation." C++ made it a language-level convention (`begin`/`end`/`operator++`/`operator*`), so that any type following this convention can plug into the entire STL algorithm ecosystem.
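To make the "algorithms are no longer tied to a container" claim concrete, here is a minimal sketch of such a sum function. The names `sum_range` and `check_sum_range` are my own, made up for illustration, and do not come from Shah's talk; in real code `std::accumulate` already does this job. The function touches its arguments only through `!=`, `++`, and `*`, so in principle any input iterator should be enough:

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <set>
#include <vector>

// Hypothetical helper (illustration only): sums the half-open range [first, last).
// It uses nothing but !=, ++ and *, so it does not care which container produced the iterators.
template <typename It>
auto sum_range(It first, It last)
{
    typename std::iterator_traits<It>::value_type total{};
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}

// Small self-check that feeds the same template four different kinds of iterators.
inline void check_sum_range()
{
    const std::vector<int> v{1, 2, 3, 4};
    const std::list<int> l{10, 20, 30};
    const std::set<int> s{5, 5, 7};   // the duplicate 5 is dropped by the set
    const int raw[] = {2, 4, 6};      // raw array: iterators are plain pointers

    assert(sum_range(v.begin(), v.end()) == 10);
    assert(sum_range(l.begin(), l.end()) == 60);
    assert(sum_range(s.begin(), s.end()) == 12);
    assert(sum_range(std::begin(raw), std::end(raw)) == 12);

    // Empty range: first == last, so the loop body never runs.
    assert(sum_range(v.begin(), v.begin()) == 0);
}
```

The same template instantiates for a `vector`, a `list`, a `set`, and a raw array without any container-specific code, and the empty case needs no special handling, which is the half-open `[begin, end)` convention doing its job.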
+
+This definition of "a pair of iterators is a range" is precisely the predecessor of the `std::ranges::range` concept we'll cover in part three. The difference is that C++20's range concept allows `end` to return a sentinel of a **different type from `begin`**, which unlocks some interesting capabilities (for example, when traversing a C string ending with `'\0'`, you don't need to calculate the length first). We'll save that for part three.
+
+## What We've Clarified So Far
+
+Starting from the most primitive index-based `for`, we saw how "traversal" was abstracted step by step: index-based loops tightly coupled traversal with "contiguous storage + random access"; pointer traversal liberated it to the "address" level; iterators further abstracted it into "an object that can `++` and `*`," thereby decoupling algorithms from data structures. We also filled in the iterator category hierarchy that Shah skipped, and used GCC 16.1.1 to empirically verify a key fact: **the old tags broadly label `vector`/`string`/raw pointers as `random_access`, while C++20 concepts can precisely state that they're actually the stronger `contiguous_iterator`**. This is exactly where concepts outshine tags, and part of why Ranges had to wait for C++20 to land.
+
+The core takeaway is one sentence: **a pair of iterators (one `begin`, one `end`) defines a range, and STL algorithms are built on top of this pair of iterators.**
+
+In the next article, we'll hand this pair of iterators to STL algorithms, seeing how `std::sort`, `std::partition`, and `std::transform` work as "loop replacements," and what hard requirements they have on iterator categories (for example, why `std::sort` can't be used on `std::list`). There are also a few classic iterator pitfalls waiting for us there: iterator invalidation, mismatched `begin`/`end`, and reversed argument order.
If you want to review container memory layouts first, vol3's [span: A View That Doesn't Own Data](../../../vol3-standard-library/02-span.md) and the container-related articles are excellent prerequisite reading.
+
+
+
+
+
+
+
+
+
+
diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/02-stl-algorithms-and-iterator-pitfalls.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/02-stl-algorithms-and-iterator-pitfalls.md
new file mode 100644
index 000000000..bf61ea94b
--- /dev/null
+++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/02-stl-algorithms-and-iterator-pitfalls.md
@@ -0,0 +1,419 @@
+---
+title: STL Algorithms in Practice and Iterator Pitfalls
+description: 'CppCon 2025 Talk Notes - Mike Shah: STL Algorithm Family in Practice,
+  Hard Constraints on Iterator Categories, with an Algorithm Cheat Sheet and Invalidation
+  Rules Table, Using GCC to Test Silent UB from Iterator Invalidation and Capture
+  with _GLIBCXX_DEBUG'
+conference: cppcon
+conference_year: 2025
+talk_title: 'Back to Basics: C++ Ranges'
+speaker: Mike Shah
+video_youtube: https://www.youtube.com/watch?v=Q434UHWRzI0
+tags:
+- cpp-modern
+- host
+- beginner
+- Ranges
+- 容器
+difficulty: beginner
+platform: host
+cpp_standard:
+- 11
+- 17
+- 20
+chapter: 3
+order: 2
+translation:
+  source: documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/02-stl-algorithms-and-iterator-pitfalls.md
+  source_hash: 1cae51e3ffb476ae69dfde3639fb5f16f98e7c907185b4cb8216ef73fb3a213d
+  translated_at: '2026-06-13T02:13:47.465561+00:00'
+  engine: anthropic
+  token_count: 3893
+---
+# STL Algorithms in Practice and Iterator Pitfalls
+
+:::tip
+This is the second article in the CppCon 2025 Mike Shah "Back to Basics: C++ Ranges" series. In the previous article, we abstracted "traversal" from index-based loops all the way to iterators, concluding that **a pair of `begin`/`end` iterators defines a range**.
In this article, we feed that iterator pair to STL algorithms, seeing how they write loops for us and what hard requirements they impose on iterators. We will also dissect several classic iterator pitfalls, all tested live with GCC 16.1.1. The environment is the same as before: Arch Linux WSL, `-std=c++20`.
+:::
+
+At the end of the previous article, we said that algorithms are built on top of that iterator pair. To make this concrete, we first need to understand what pieces the STL is actually composed of.
+
+## The Three Pillars of the STL
+
+The design philosophy of the Standard Template Library (STL) decouples three things: **containers** are responsible for storing data, **iterators** are responsible for traversing data, and **algorithms** are responsible for processing data. Iterators are the "glue" that connects the three: algorithms don't know any specific container directly, they only recognize iterators, and as long as a container can produce iterators that meet the requirements, it can be reused by all algorithms. This decoupling is the fundamental reason why the STL can use a single `std::sort` to handle `vector`, `array`, and `deque`.
+
+So, which header files actually contain the algorithms?
+
+:::warning Shah's "two headers" is a bit too narrow
+In his talk, Shah says "algorithms are mainly in the `<algorithm>` and `<numeric>` headers." This is fine for a beginner's understanding, but it **misses several pieces**. The full picture looks like this: general algorithms (`sort`, `find`, `copy`, `transform`, etc.) are in `<algorithm>`; numeric algorithms (`accumulate`, `reduce`, `inner_product`, etc.) are in `<numeric>`; **parallel algorithms** (like `sort(std::execution::par, ...)` with execution policies) require `<execution>` (C++17); the C++20 constrained algorithms (`std::ranges::sort` and friends) also live in `<algorithm>`, while views and the range concepts are in `<ranges>`; and the split isn't always intuitive: `std::midpoint` is in `<numeric>`, but C++23's fold algorithms such as `std::fold_left` are in `<algorithm>`.
So don't memorize "algorithms = two headers"; it's more accurate to remember "algorithms are spread across several headers, with `` as the main one." +::: + +## Algorithm Cheat Sheet: By Category and Required Iterator Type + +There are over a hundred STL algorithms, and memorizing them is pointless. A better way to remember them is to **group them by category**, and to keep in mind the **hard requirements each category places on iterator types**—because this directly determines whether you can use a given algorithm on a particular container. The following table is a key creative addition of this article; Shah didn't expand on it in his talk: + +| Category | Representative Algorithms | Required Iterator Category | +|------|------|------| +| Read-only search | `find` / `find_if` / `count` / `accumulate` | input (weakest acceptable) | +| Modifying copy | `copy` / `transform` / `replace` / `fill` | forward / output | +| Partitioning | `partition` / `stable_partition` | forward (stable version requires bidirectional) | +| Sorting | `sort` / `stable_sort` / `partial_sort` | **random_access** (hard requirement) | +| Binary search | `lower_bound` / `upper_bound` / `binary_search` | forward (**and the range must already be sorted**) | +| Numeric reduction | `reduce` / `transform_reduce` / `inner_product` | input | +| Heap operations | `push_heap` / `pop_heap` / `sort_heap` | random_access | + +The single most important thing to remember here is: **sorting algorithms require random access iterators**. This means they can only be used on contiguous or random-access containers like `vector`, `array`, and `deque`. **Using them on `std::list` simply won't compile**. This isn't a suggestion; it's a hard constraint. Let's test this. + +## Experiment: std::sort Cannot Be Used on std::list + +`std::list` provides bidirectional iterators, which don't support `it + n` or subtracting two iterators. 
Meanwhile, `std::sort` internally requires random access (it needs to do `__last - __first` to estimate recursion depth). What happens if we feed it a list's iterators? + +```cpp +#include +#include + +int main() +{ + std::list l{3, 1, 2}; + std::sort(l.begin(), l.end()); // 编不过! +} +``` + +GCC 16.1.1 error output (key lines extracted): + +```bash +❯ g++ -std=c++20 list_sort.cpp -o list_sort +/usr/include/c++/16.1.1/bits/stl_algo.h:1914:50: error: no match for ‘operator-’ + (operand types are ‘std::_List_iterator’ and ‘std::_List_iterator’) + 1914 | std::__lg(__last - __first) * 2, + | ~~~~~~~^~~~~~~~~ +``` + +See that—the error occurs right at the `__last - __first` step: `std::sort` wants to use iterator subtraction to calculate the range length, but `_List_iterator` simply doesn't define `operator-` (bidirectional iterators only understand `++`/`--`, not subtraction). This is the classic manifestation of "iterator category doesn't satisfy algorithm requirements." If you really need to sort a `list`, use its member function `l.sort()`—that's a merge sort tailored for linked lists with O(n log n) complexity, but it doesn't rely on random access. + +## sort, partition, copy, transform: What Common Algorithms Look Like + +Let's quickly run through the most commonly used algorithms to build intuition. Their parameter shapes are remarkably consistent—the vast majority take **a pair of iterators `(first, last)` plus an optional predicate or destination**. 
+
+```cpp
+#include <algorithm>
+#include <iterator>
+#include <random>
+#include <vector>
+
+void demo(std::vector<int>& v, const std::vector<int>& src)
+{
+    // Sort the whole range
+    std::sort(v.begin(), v.end());
+
+    // Partial sort: only [begin, begin+3) ends up sorted; the remaining
+    // elements are in unspecified order but are all >= the first 3
+    // std::partial_sort(v.begin(), v.begin() + 3, v.end());
+
+    // Partition: move elements that satisfy the predicate to the front;
+    // the return value is the partition point
+    // (note: `it` is invalidated by the push_back calls made by copy below)
+    auto it = std::partition(v.begin(), v.end(), [](int x) { return x < 4; });
+
+    // Copy: back_inserter calls push_back for us, so there is no need to
+    // compute the size up front
+    std::copy(src.begin(), src.end(), std::back_inserter(v));
+
+    // Shuffle: a random number engine is required (rand() is discouraged since C++11)
+    std::shuffle(v.begin(), v.end(), std::mt19937{std::random_device{}()});
+}
+```
+
+Two details here are worth elaborating on. `std::back_inserter(v)` returns an **output iterator**; each write to it calls `v.push_back()`. This avoids the hassle of "needing to know how many elements to copy and reserving space in advance," which makes it the most common partner for `copy`. `std::shuffle` reminds us that **since C++11, random numbers should come from the engines in the `<random>` header (like `std::mt19937`), not the old `rand()`**: `rand()` has poor quality and thread-safety issues.
+
+Now look at `std::transform`, which encapsulates the "apply a function to each element" pattern. Note the use of `cbegin`/`cend` here, the **const versions of the iterators**, which signal "I only read from the source range, I don't modify it":
+
+```cpp
+#include <algorithm>
+#include <cctype>
+#include <iterator>
+#include <string>
+
+std::string s = "hello";
+std::string out;
+std::transform(s.cbegin(), s.cend(), std::back_inserter(out),
+               [](char c) { return std::toupper(static_cast<unsigned char>(c)); });
+// out == "HELLO"
+```
+
+`cbegin`/`cend` return `const_iterator`, while `rbegin`/`rend` return reverse iterators. An easy pitfall: **these iterators must be used in matching pairs**. You can't pair `cbegin()` with `end()`, because the classic algorithms deduce a single iterator type for both arguments, and `const_iterator` and `iterator` are different types. Since C++20, const iterators have become even more prominent in the standard library (see proposals such as P0896), because the ranges system relies heavily on them.
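The snippets above are fragments, so here is a minimal self-contained sketch of the same two patterns: `transform` reading through const iterators into a `back_inserter`, and `partition` returning the boundary iterator. The helper names `to_upper_copy` and `count_below_after_partition` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: returns an upper-cased copy of `in`.
// The source is read through const iterators; back_inserter grows `out`.
std::string to_upper_copy(const std::string& in)
{
    std::string out;
    out.reserve(in.size());  // optional: avoids repeated reallocation
    std::transform(in.cbegin(), in.cend(), std::back_inserter(out),
                   [](char c) {
                       // toupper expects a value representable as unsigned char
                       return static_cast<char>(
                           std::toupper(static_cast<unsigned char>(c)));
                   });
    return out;
}

// Hypothetical helper: moves the elements below `limit` to the front and
// returns how many there are (the distance from begin to the partition point).
std::size_t count_below_after_partition(std::vector<int>& v, int limit)
{
    auto mid = std::partition(v.begin(), v.end(),
                              [limit](int x) { return x < limit; });
    // Everything in [begin, mid) satisfies the predicate.
    assert(std::all_of(v.begin(), mid, [limit](int x) { return x < limit; }));
    return static_cast<std::size_t>(std::distance(v.begin(), mid));
}
```

Both helpers use the partition point or the output string immediately, before any later operation could invalidate an iterator, which is the habit the pitfalls later in this article argue for.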
+
+## rotate: Parameter Order Is the Biggest Pitfall
+
+`std::rotate` is a very useful but particularly easy-to-get-wrong algorithm. Its job is to "cyclically shift elements in a range so that the element pointed to by `middle` becomes the new first element." The signature takes three iterators: `std::rotate(first, middle, last)`.
+
+```cpp
+std::vector<int> v{1, 2, 3, 4, 5};
+std::rotate(v.begin(), v.begin() + 2, v.end());
+// Result: {3, 4, 5, 1, 2}; middle (begin+2, i.e. the element 3) became the new first element
+```
+
+Actual output:
+
+```bash
+❯ g++ -std=c++20 rot_ok.cpp -o rot_ok && ./rot_ok
+rotate(begin, begin+2, end) on {1,2,3,4,5} -> { 3 4 5 1 2 }
+```
+
+The trap here is that **the vast majority of algorithms take two iterators `(first, last)`, whereas `rotate` (like `partial_sort`, `nth_element`, and a few others) takes three: `(first, middle, last)`**. Once you develop muscle memory for "two parameters," it's extremely easy to swap the positions of `middle` and `last` when writing `rotate`. Shah himself complained about this: he used `upper_bound` to find an insertion point and then `rotate` to manually implement insertion sort, calling it "too clever, ugly."
+
+So what happens if you get the order wrong? I swapped `middle` and `last`, writing it as `rotate(first, last, middle)`:
+
+```cpp
+std::vector<int> w{1, 2, 3, 4, 5};
+std::rotate(w.begin(), w.end(), w.begin() + 2);  // wrong parameter order
+```
+
+```bash
+❯ g++ -std=c++20 rot_bad.cpp -o rot_bad && ./rot_bad
+about to call rotate(begin, end, begin+2)...
+[program crashed, exit code 139: SIGSEGV]
+```
+
+On this machine the program segfaults immediately (exit code 139 = 128 + 11, i.e. SIGSEGV). The reason is straightforward: `std::rotate` requires both `[first, middle)` and `[middle, last)` to be valid sub-ranges; in other words, the three iterators must satisfy the `first <= middle <= last` ordering. Written as `(first, last, middle)`, the second sub-range `[middle_arg=last, last_arg=middle)` is invalid (its end is before its start), so the call is undefined behavior; here the algorithm walked off the end of the buffer and crashed, but a crash is not guaranteed.
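To make the `(first, middle, last)` contract concrete, here is a hedged sketch of the `upper_bound` + `rotate` insertion sort that Shah called "too clever", followed by a small wrapper that takes an offset instead of an iterator so the ordering precondition can be checked. Both function names (`insertion_sort`, `rotate_left_by`) are made up for this illustration; this is one possible way to write it, not code from the talk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Insertion sort built from upper_bound + rotate.
void insertion_sort(std::vector<int>& v)
{
    for (auto it = v.begin(); it != v.end(); ++it) {
        // [v.begin(), it) is already sorted; find where *it belongs.
        auto pos = std::upper_bound(v.begin(), it, *it);
        // rotate(first, middle, last) with pos <= it < it + 1:
        // the element at `it` becomes the first element of [pos, it + 1).
        std::rotate(pos, it, std::next(it));
    }
}

// Hypothetical wrapper: rotate left by n positions. Taking an offset makes
// first <= middle <= last checkable before calling std::rotate.
void rotate_left_by(std::vector<int>& v, std::size_t n)
{
    assert(n <= v.size());  // guarantees begin <= begin + n <= end
    std::rotate(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(n), v.end());
}
```

In both calls the middle argument is, by construction, never past the last one, which is exactly the property the swapped-argument version above violates.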
+
+:::warning For three-iterator algorithms, always check the documentation for parameter order
+Algorithms like `rotate`, `partial_sort`, `nth_element`, and `inplace_merge` don't take a simple `(first, last)` pair but a three-part form such as `(first, middle, last)`. Before using them, confirm what `middle` actually refers to. This improves in the ranges versions covered in part three: they often need fewer arguments (you pass the container directly), which reduces the chance of pairing errors.
+:::
+
+## How Many Algorithms Are There Really? The "Over 200" Claim Needs an Asterisk
+
+In his talk, Shah mentions a widely circulated number: "A 2018 CppCon talk said there are at least 105 algorithms, and now there are over 200." Is this accurate? Let's fact-check it.
+
+First, the origin of the "105" figure: it comes from Jonathan Boccara's CppCon 2018 talk, "105 STL Algorithms in Less Than an Hour". That talk used a **very loose counting criterion**: it counted `_if` variants (`find` / `find_if`), `_n` variants (`copy` / `copy_n`), and `_copy` variants (`remove` / `remove_copy`) as separate algorithms, which made the material easier to follow and present.
+
+So what's the strict number? I checked against cppreference, and as of C++23:
+
+- The `<algorithm>` header contains approximately **91** `std::` function templates (not counting ranges versions).
+- The `<numeric>` header contains **14** numeric algorithms (`accumulate`, `reduce`, `inner_product`, etc.; C++26 will add 5 more for saturation arithmetic, bringing it to 19).
+- The `std::ranges::` namespace contains approximately **100** "constrained algorithms" (niebloids, which are the ranges versions of algorithms).
+- Additionally, there are about 14 uninitialized memory algorithms in `<memory>`.
+
+So the "over 200" claim **only holds true if you count both the `std::` and `std::ranges::` APIs as separate entries, plus various variant overloads**. If you count by "unique algorithm names," the actual number is approximately **110 to 120**.
+
+:::tip How to phrase it accurately
+Rather than saying "the STL has over 200 algorithms," a more rigorous statement is: **the STL has over 100 unique algorithms; if you count both the `std::` and `std::ranges::` interfaces as entries, there are indeed over 200 API entry points.** This distinction matters in interviews or technical writing: "over 200" sounds impressive, but a large portion of that consists of variants and ranges mirrors of the same algorithm.
+:::
+
+## Pitfall 1: Iterator Invalidation, the Most Insidious Killer
+
+Once you're familiar with the algorithms themselves, they aren't hard to use. What really trips people up is **coordinating the lifetimes of iterators and containers**. The number one pitfall is **iterator invalidation**.
+
+Consider this code that looks perfectly innocent:
+
+```cpp
+std::vector<int> v{1, 2, 3};
+auto it = v.begin();        // it points to v's first element
+v.push_back(4);             // if this triggers a reallocation, it dangles!
+std::cout << *it << '\n';   // dereferencing a dangling iterator: UB
+```
+
+The problem lies in `push_back`. Internally, `vector` is a contiguous dynamic array; when capacity is insufficient, it **reallocates a larger block of memory**, moves the old elements over, and then frees the old memory. But your `it` still points to that **now-freed old memory**. It has become an invalidated iterator, in effect a dangling pointer (libstdc++'s debug mode, shown below, reports it as a "singular" iterator). Dereferencing `*it` at this point is undefined behavior (UB).
+
+The scary part is that **UB doesn't necessarily crash immediately**. It often manifests as "reading a seemingly normal value," so you think everything is fine, merge the code into main, and then one day it inexplicably crashes on a customer's machine. Let's test this with a normal compilation (no debug flags):
+
+```cpp
+#include <iostream>
+#include <vector>
+int main()
+{
+    std::vector<int> v{1, 2, 3};
+    auto it = v.begin();
+    std::cout << "before push_back: *it=" << *it << ", cap=" << v.capacity() << "\n";
+    v.push_back(4); v.push_back(5); v.push_back(6); v.push_back(7);  // guaranteed to reallocate
+    std::cout << "after push_back: cap=" << v.capacity() << "\n";
+    std::cout << "deref stale it: " << *it << "\n";  // UB: reads freed memory
+}
+```
+
+```bash
+❯ g++ -std=c++20 -O0 inval.cpp -o inval && ./inval; echo "exit code=$?"
+before push_back: *it=1, cap=3
+after push_back: cap=12
+deref stale it: -40771459
+exit code=0
+```
+
+The program **exits normally (exit code 0) with no errors**, but the value read out is garbage like `-40771459`. After the `vector` grows, the capacity jumps from 3 to 12, the old memory is freed, and the memory `it` points to contains whatever residual data happens to be there. This is UB at its most insidious: **silent errors**.
+
+So how do you catch it? libstdc++ (the standard library that GCC ships, and that Clang uses by default on most Linux systems) provides a debug mode, enabled with `-D_GLIBCXX_DEBUG`. When enabled, standard library iterators carry bounds and validity checks; the moment you dereference an invalidated iterator, the program aborts and prints diagnostics. Let's compile the same code with debug mode enabled:
+
+```bash
+❯ g++ -std=c++20 -O0 -g -D_GLIBCXX_DEBUG inval.cpp -o inval_dbg && ./inval_dbg; echo "exit code=$?"
+before push_back: *it=1, cap=3
+after push_back: cap=12
+/usr/include/c++/16.1.1/debug/safe_iterator.h:352:
+Error: attempt to dereference a singular iterator.
+Objects involved in the operation:
+    iterator "this" @ 0x7fff6bd63820 {
+      type = gnu_cxx::normal_iterator>(mutable iterator);
+      state = singular;        ← the iterator has been invalidated
+      references sequence with type 'std::debug::vector' @ 0x7fff6bd63850
+    }
+exit code=134                  ← 134 = SIGABRT, aborted deliberately by the debug library
+```
+
+Caught red-handed this time: `state = singular` explicitly tells you the iterator is invalid, and `attempt to dereference a singular iterator` precisely identifies what you did. A single `-D_GLIBCXX_DEBUG` macro turns "silent UB" into "instant crash + precise location". Enable it during development and disable it for release builds (it has a performance cost). The MSVC equivalent is `_ITERATOR_DEBUG_LEVEL=2`; Release configurations default to 0, while Debug configurations default to 2.
+
+:::tip Iterator invalidation rules cheat sheet (verified against cppreference)
+Invalidation rules vary significantly between containers; just remember the general principles and look up the specifics:
+
+- **`vector` / `string`**: `push_back` invalidates **all** iterators only when it triggers a reallocation (capacity change); when no reallocation occurs, only `end()` changes. After `reserve`, as long as you don't exceed the reserved capacity, iterators are not invalidated.
+- **`deque`**: Insertions at either end invalidate **all iterators** (even without reallocation), but **references and pointers remain valid**. So be careful when traversing a deque; storing references is safer than storing iterators.
+- **`list` / `forward_list`**: Insertions and `splice` **do not invalidate** any existing iterators (linked list nodes don't move); erasing invalidates only the iterators to the erased node.
+- **`unordered_*`**: A rehash (triggered when an insertion forces the bucket count to grow) invalidates **iterators, but references and pointers to elements remain valid**.
+
+Remember one overarching principle: **whenever a container might "move house" internally (contiguous storage reallocating, hash tables rehashing), iterators may be invalidated; node-based containers (list, tree nodes) don't move, so their iterators are stable.**
+:::
+
+## Pitfall 2: Mismatched Iterator Pairs (begin and end Must Come from the Same Object)
+
+The second pitfall relates to "pairing." Algorithms require `first` and `last` to come from **the same container**, but C++ can't enforce this at compile time: if you pass iterators of the same type that belong to two different containers, the compiler accepts them without complaint, and the result is UB.
+
+The classic crash scenario comes from Jason Turner's C++ Weekly (which Shah specifically referenced in his talk): a function returns a temporary `vector`, and to save trouble, you chain `.begin()` and `.end()` calls directly:
+
+```cpp
+std::vector<int> download_data();  // each call returns a brand-new temporary vector
+
+// Dangerous:
+// process(download_data().begin(), download_data().end());
+```
+
+:::warning Shah understates this here
+Shah's commentary on this code is "maybe it works sometimes, maybe we get lucky". This statement **could mislead beginners** because it implies "there are legitimate cases where this works." **There aren't.** This is undefined behavior; there is no "legitimately working" path, only the illusion of "UB accidentally behaving normally."
+
+The reason: the two `download_data()` calls are **two independent function calls**, returning **two different temporary `vector` objects**. Their `.begin()` and `.end()` point into two completely unrelated memory blocks. Pairing one temporary's `begin` with another temporary's `end` and feeding them to an algorithm means the range isn't valid at all. Note that both temporaries stay alive until the end of the full expression, so nothing is dangling while `process` runs; the problem is purely that `first` and `last` belong to different objects. (Keeping either iterator around after that statement would dangle as well.) **The correct approach is to first store the result in a named variable**, so that `begin` and `end` come from the same living object:
+
+```cpp
+auto data = download_data();          // one named variable, one block of memory
+process(data.begin(), data.end());    // begin/end come from the same data: safe
+```
+
+This illusion of "same function name means same object" is a frequent source of pairing errors.
+:::
+
+## Pitfall 3: Insufficient Space (Cramming Too Much into a Fixed-Size Destination)
+
+The third pitfall relates to output destinations. When you use `std::copy` to write data to a **fixed-size** destination (like a raw array, or a container written through a plain iterator rather than a `back_inserter`), and the source range is larger than the destination space, you get an **out-of-bounds write**. That is again UB, and it can silently corrupt adjacent memory.
+
+```cpp
+int src[10] = {0,1,2,3,4,5,6,7,8,9};
+int dst[3];   // only 3 slots!
+std::copy(std::begin(src), std::end(src), std::begin(dst));  // out-of-bounds write: UB
+```
+
+This code compiles, runs, and doesn't immediately report errors, but you've written 7 values that shouldn't be there into the memory after `dst`. This kind of bug can be caught with AddressSanitizer (`-fsanitize=address`), which reports a stack-buffer-overflow here (or a heap-buffer-overflow when the destination lives on the heap).
+
+The workaround is straightforward: either use `std::back_inserter` (letting the destination container grow automatically), or size the destination before copying (with `resize`; `reserve` alone does not create elements to write into) and confirm the source range fits. Circling back to our first lesson: **letting the container manage its own size (using an inserter) is much safer than manually calculating sizes.**
+
+## Error Quality: Are Ranges Really More Friendly?
+
+In his summary, Shah says "Ranges use concepts and give you better error messages." This is true, but with a caveat. Let's compare the errors from both interfaces when "passing the wrong parameters."
+
+First, the classic `std::sort` with wrong parameters: pairing `begin` from a `vector` with `end` from a `list` (type mismatch):
+
+```cpp
+std::vector<int> v{1,2,3};
+std::list<int> l{4,5,6};
+std::sort(v.begin(), l.end());  // iterators from two different containers
+```
+
+Now the ranges version with wrong parameters: passing something that isn't a range at all to `std::ranges::sort`:
+
+```cpp
+int not_a_range = 42;
+std::ranges::sort(not_a_range);
+```
+
+Error line counts from both under GCC 16.1.1:
+
+```bash
+❯ # classic version
+❯ g++ -std=c++20 err_classic.cpp 2>err_c.txt; wc -l < err_c.txt
+32
+❯ head -3 err_c.txt
+err_classic.cpp:7:14: error: no matching function for call to
+  'sort(std::vector<int>::iterator, std::__cxx11::list<int>::iterator)'
+
+❯ # ranges version
+❯ g++ -std=c++20 err_ranges.cpp 2>err_r.txt; wc -l < err_r.txt
+69
+```
+
+Here's the interesting part: **in this specific example, the ranges version's error (69 lines) is actually longer than the classic version's (32 lines)**. This is because passing an `int` to `ranges::sort` forces the compiler to unfold the entire concept constraint chain (`sortable` → `random_access_iterator` → ...) for you to see; the longer the chain, the more verbose the error. So I have to honestly correct a common impression: **"ranges errors are always shorter and friendlier" doesn't hold up**. Their readability depends heavily on compiler version and the specific scenario (GCC 10+ / Clang 12+ are more mature; older compilers still spit out a screenful of template gibberish).
+
+So what's the real advantage of ranges when it comes to "errors"? It's not the line count, but **that it prevents you from writing certain bugs in the first place**. Recall pitfall two from above: the classic `std::sort` accepts two iterators, so you can easily mismatch `begin`/`end` from two different containers. When the iterator types differ (as in `err_classic`), the compiler at least rejects the call; when the two containers have the same type, it says nothing at all. With the range overload of `std::ranges::sort` you pass **a single range**, so you can't even express the error of "begin from A, end from B." **Having one fewer opportunity to make a mistake is far more practical than friendlier error messages.** This is the core safety benefit of ranges, which we'll expand on in part three.
+
+## Transition: Must Iterators Die?
+
+At this point in the talk, Shah put up a rather exaggerated slide: "Iterators must die." Exaggeration aside, the sentiment he wanted to express is real: **while the iterator interface is powerful, it's full of pitfalls**. Pairing is error-prone, parameter order (for three-iterator algorithms) is easy to get backwards, and partial sort syntax is ugly.
+
+The good news is that C++20 Ranges directly addresses these pain points. It doesn't abandon iterators (iterators remain the underlying mechanism, and even C++26 can't do without them), but it wraps a safer, more composable interface layer on top of them: **passing containers directly instead of iterator pairs, using concepts to intercept type errors early at compile time, and using views for lazy composition**. These are the main threads of part three.
+
+In the next article, we'll formally dive into Ranges, starting from "why `ranges::sort` takes one fewer parameter," moving through lazy evaluation of views, the pipe operator, and `ranges::to`, and finishing with a feature that will make your eyes light up: **infinite ranges**. If you're interested in parallel versions of numeric algorithms (`reduce`, `transform_reduce`), you can check out the content on `<execution>` execution policies and `std::reduce` parallel reduction in the vol5 concurrency volume; that's where algorithms and concurrency intersect.
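As a recap of pitfalls 1 and 3 from this article, the sketch below shows two defensive habits under simple assumptions: remembering a position as an index (and re-acquiring the iterator after growth) rather than holding an iterator across `push_back`, and letting `back_inserter` size the destination. The function names `first_after_growth` and `copy_all` are hypothetical and only serve this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Pitfall 1 workaround: keep an index, not an iterator, across operations
// that may reallocate, and re-acquire the iterator afterwards.
int first_after_growth(std::vector<int>& v, int extra)
{
    assert(!v.empty());
    const std::size_t idx = 0;      // an index survives reallocation
    for (int i = 0; i < extra; ++i) {
        v.push_back(i);             // may invalidate every existing iterator
    }
    auto it = v.begin() + static_cast<std::ptrdiff_t>(idx);  // fresh iterator
    return *it;
}

// Pitfall 3 workaround: let the destination grow instead of sizing it by hand.
std::vector<int> copy_all(const std::vector<int>& src)
{
    std::vector<int> dst;
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
    return dst;
}
```

Neither helper changes what the algorithms do; they only remove the manual bookkeeping (stale iterators, hand-computed sizes) that made the original snippets undefined behavior.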
+
+
+
+
+
+
+
+
+
+
+
diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/03-ranges-views-and-composition.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/03-ranges-views-and-composition.md
new file mode 100644
index 000000000..299d49ed5
--- /dev/null
+++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/03-ranges-views-and-composition.md
@@ -0,0 +1,437 @@
+---
+title: 'Ranges, Views, and Pipeline Composition: The Power of Lazy Evaluation'
+description: 'CppCon 2025 Talk Notes - Mike Shah: Constrained algorithms, view lazy
+  evaluation, pipe operator, ranges::to, plus eager vs. lazy benchmark comparisons,
+  infinite ranges, and a views version attribution table (C++20/23/26)'
+conference: cppcon
+conference_year: 2025
+talk_title: 'Back to Basics: C++ Ranges'
+speaker: Mike Shah
+video_youtube: https://www.youtube.com/watch?v=Q434UHWRzI0
+tags:
+- cpp-modern
+- host
+- intermediate
+- Ranges
+difficulty: intermediate
+platform: host
+cpp_standard:
+- 20
+- 23
+chapter: 3
+order: 3
+translation:
+  source: documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/03-ranges-views-and-composition.md
+  source_hash: 1a4d9d53040a5421050b3d5066124dd1f0ad9a30db9364f11086d6e1d865fb90
+  translated_at: '2026-06-13T02:14:47.820594+00:00'
+  engine: anthropic
+  token_count: 3928
+---
+# Ranges, Views, and Pipeline Composition: The Power of Lazy Evaluation
+
+:::tip
+This is the finale of the series on Mike Shah's CppCon 2025 talk "Back to Basics: C++ Ranges". In the first two parts, we traced the path from "loops → iterators → algorithms" and dissected the classic iterator pitfalls (invalidation, mismatched pairs, and argument order). In this part, we dive into the core of Ranges: constrained algorithms, lazy evaluation of views, pipeline composition, and materializing results back into containers with `ranges::to`. This part is experiment-heavy and spans both C++20 and C++23, so the compiler flags will switch between `-std=c++20` and `-std=c++23`, a detail that itself foreshadows this article's themes. Environment: Arch Linux WSL, GCC 16.1.1.
+:::
+
+At the end of the previous part, Shah closed with an exaggerated slide declaring "Iterators must die." In this part, we'll see how Ranges builds a safer, more composable interface layer on top of iterators. Let's start with the most fundamental question: **what exactly did Ranges change?**
+
+## A range is still that pair of iterators, but the end can be a "sentinel"
+
+The underlying definition hasn't changed: a range is still bounded by a beginning and an end. But C++20 gave it an important extension: **the end can be a different type from the beginning, called a sentinel**.
+
+Why allow different types? Consider a classic example: iterating over a C-style string terminated by `'\0'`. In the traditional iterator model, you have to call `strlen` to compute the length before you can determine `end`, when all you really need is to "keep going until you hit `'\0'`." A sentinel expresses an end condition of "walk until some condition is met." Its type can differ from the iterator's, as long as the two are comparable (`it == sentinel`). This makes iterating over "sequences of unknown length" natural, and it is precisely the foundation that makes "infinite ranges" possible later on.
+
+## From range-v3 to Standard Ranges: concepts are the missing piece
+
+Ranges didn't just appear out of nowhere in C++20. Its prototype was Eric Niebler's **range-v3** library, which was available as early as the C++14 era. If your project is still stuck on C++14/17, you can use range-v3 to practice; its API is highly similar to the Standard Library Ranges, which keeps future migration costs very low.
+
+So why did the standard library version wait until C++20? **Because Ranges relies heavily on concepts for its implementation**. Ranges needs to precisely express constraints like "what counts as a range" or "what qualifies as a random-access iterator." Before concepts, these constraints could only be implemented via SFINAE (Substitution Failure Is Not An Error), which produced error messages that routinely spanned dozens of lines of template gibberish and were practically unreadable. Concepts allow constraints to be named and checked early, and that was the final missing piece that allowed Ranges to enter the standard.
+
+## Constrained algorithms: one fewer parameter, one fewer chance for error
+
+The most immediately noticeable improvement in Ranges is **constrained algorithms** (the official name on cppreference). They share the same names as classic algorithms, but reside in the `std::ranges::` namespace. The difference is that **classic algorithms require you to pass an iterator pair `(first, last)`, while the ranges version only requires you to pass a container (or any range)**.
+
+```cpp
+#include <algorithm>
+#include <ranges>
+#include <vector>
+
+std::vector<int> v{3, 1, 4, 1, 5, 9};
+
+std::sort(v.begin(), v.end());   // classic: pass a pair of iterators
+std::ranges::sort(v);            // ranges: pass the whole container
+```
+
+`ranges::sort(v)` does exactly the same thing as `sort(v.begin(), v.end())`, but it takes one fewer argument. The benefit isn't just less typing. Returning to pitfall #2 from the previous part, "mismatched begin/end": **classic algorithms allow you to accidentally pair iterators from two different containers, while the range overload doesn't even give you that opportunity**, because it only accepts a single object. Eliminating one possible error is a tangible safety improvement.
+ +Constrained algorithms also support span, custom containers, or anything that satisfies the `std::ranges::range` concept: + +```cpp +int arr[] = {3, 1, 4}; +std::ranges::sort(arr); // 原生数组也行 + +std::ranges::find_if(v, [](int i) { return i > 4; }); +// ranges::find_if 同样返回迭代器(指向找到的元素), +// 用 ranges::end(v) 判断是否没找到 +``` + +:::tip Iterator knowledge is not obsolete +Note that `ranges::find_if` still returns an iterator — **which means all the iterator knowledge from the previous part is still useful**. Iterator invalidation and pairing issues still exist in ranges; Ranges just makes them harder to trigger (not eliminated, just harder). We will still need iterators in C++26. +::: + +## Views: lazy evaluation, the soul of Ranges + +Constrained algorithms are just the appetizer. The real killer feature of Ranges is **views**. A view is a **lazy** way to access a range — it doesn't copy data or precompute results. Instead, as you iterate over it, it **processes one element at a time**. + +Let's compare the two styles. `std::ranges::sort(v)` is **eager evaluation** — it immediately sorts the entire range in place and only returns after finishing. In contrast, `std::views::filter(...)` is **lazy evaluation** — it simply sets up a "filtering pipeline" without doing any computation, and only yields each element to you as you actually iterate over it, but only if it meets the condition. + +```cpp +#include +#include +#include + +std::vector v{1, 2, 3, 4, 5, 6}; + +// 搭管道:此时 filter 一个元素都没处理 +auto gt3 = v | std::views::filter([](int x) { return x > 3; }); + +// 遍历时才真正执行过滤 +for (int x : gt3) { + std::cout << x << ' '; // 4 5 6 +} +``` + +That `|` is the **pipe operator**, borrowed from Unix pipes — it feeds the range on the left into the view adaptor (range adaptor) on the right. 
You can chain multiple views together, composing them like a pipeline: + +```cpp +auto result = v + | std::views::filter([](int x) { return x > 1; }) // 过滤 + | std::views::transform([](int x) { return x * x; }) // 变换 + | std::views::take(3); // 只取前 3 个 +// 遍历 result 时:3²=9, ... 一路惰性求值 +``` + +## Experiment: eager vs lazy, what's the actual difference? + +Simply saying "lazy is more efficient" isn't intuitive enough, so let's run a benchmark. We'll create a `vector` with ten million elements and compare two approaches: **eager** — first use `ranges::to` to materialize the filtered results into a temporary `vector`, then iterate to sum them up; **lazy** — directly iterate over `views::filter` without building a temporary container. + +```cpp +#include +#include +#include +#include +#include +#include + +int main() +{ + constexpr int N = 10'000'000; + std::vector v(N); + std::iota(v.begin(), v.end(), 0); + const auto pred = [](int x) { return x > N / 2; }; + + // EAGER:物化过滤结果到一个临时 vector,再求和 + long long se = 0; + auto t0 = std::chrono::high_resolution_clock::now(); + { + auto tmp = v | std::views::filter(pred) | std::ranges::to>(); + for (int x : tmp) se += x; + } + auto t1 = std::chrono::high_resolution_clock::now(); + + // LAZY:直接遍历 view,不建临时容器 + long long sl = 0; + auto t2 = std::chrono::high_resolution_clock::now(); + for (int x : v | std::views::filter(pred)) sl += x; + auto t3 = std::chrono::high_resolution_clock::now(); + + auto ms_e = std::chrono::duration_cast(t1 - t0).count(); + auto ms_l = std::chrono::duration_cast(t3 - t2).count(); + std::cout << "sum eager=" << se << " lazy=" << sl << "\n"; + std::cout << "eager (ranges::to 临时 + 求和): " << ms_e << " ms\n"; + std::cout << "lazy (直接遍历 view): " << ms_l << " ms\n"; +} +``` + +GCC 16.1.1, `-std=c++23 -O2`: + +```bash +❯ g++ -std=c++23 -O2 -Wall bench.cpp -o bench && ./bench +sum eager=37499992500000 lazy=37499992500000 +eager (ranges::to 临时 + 求和): 23 ms +lazy (直接遍历 view): 7 ms +``` + +Both approaches compute 
the exact same sum (`37499992500000`, so the results check out), but **eager took 23 ms while lazy took only 7 ms, over 3 times faster**, and the lazy version **didn't allocate that temporary `vector` with millions of elements**. The eager approach is slower for two reasons: first, it has to copy five million matching elements into a temporary vector (a bunch of `push_back` plus potential reallocations), and second, it requires an extra complete traversal (materialize first, then sum, effectively traversing twice). The lazy approach traverses only once, filtering and summing simultaneously: filtered-out elements are simply skipped, with no copying whatsoever. + +:::tip How to see "laziness" with your own eyes +To get a feel for "the pipeline is set up but not executed, and execution only happens during iteration," there's a simple trick: add a `std::cout` inside the lambdas for both filter and transform, then **just set up the pipeline without iterating**. You'll find that nothing gets printed. Once you write `for (auto x : pipeline)`, each element will **traverse the entire pipeline before the next one is processed**: the first element goes through filter, and only if it passes does it enter transform, then take... It's one element going all the way through, not filtering all elements first and then transforming them. This is the lazy execution model, and it's also the reason why "short-circuiting" works later. +::: + +## Infinite ranges: magic enabled by laziness + +Lazy evaluation unlocks a very cool capability: **infinite ranges**. If evaluation were eager, infinite sequences would be impossible to express (you can't precompute an infinite number of elements). But with laziness, as long as you don't actually try to iterate over "infinity," it can exist. + +`std::views::iota(x)` generates an **infinitely incrementing** sequence starting from `x`. Paired with `take` to truncate it, it can be used safely: + +```cpp +// Generate the first 5 of 0², 1², 2², ... +for (int x : std::views::iota(0) + | std::views::transform([](int n) { return n * n; }) + | std::views::take(5)) { + std::cout << x << ' '; +} +``` + +```bash +❯ g++ -std=c++23 -O2 iota.cpp -o iota && ./iota +0 1 4 9 16 +``` + +`iota(0)` by itself is infinite (0, 1, 2, 3, ...), but `take(5)` truncates it to five elements. Lazy evaluation guarantees that the infinite portion beyond `take` **will never be evaluated**. This pattern of "defining an infinite source, then using a view to limit how much is used" is very handy when dealing with streaming data or generating sequences. `iota` is a range factory available since C++20. + +## Pipeline short-circuiting: efficiency brought by lazy evaluation + +Another direct benefit of laziness is **short-circuiting**. When you chain multiple filters together, as long as an element is filtered out at one stage, **the subsequent stages will not process it at all**, because the execution model is "one element goes all the way through." + +The example Shah gave was filtering a collection of strings: first filter for "starts with M," then filter for "length greater than 4." If a string doesn't start with M, it gets blocked at the first filter, and the predicate for the second filter **is never even called**. 
Let's quantify this effect: we'll add a counter to the filter's predicate and compare the number of predicate calls between a "full traversal" and "early termination with `take(5)`": + +```cpp +long long calls_all = 0, calls_take = 0; +auto cp_all = [&](int) { ++calls_all; return true; }; +auto cp_take = [&](int) { ++calls_take; return true; }; + +for ([[maybe_unused]] int x : v | std::views::filter(cp_all)) {} +for ([[maybe_unused]] int x : v | std::views::filter(cp_take) | std::views::take(5)) {} + +std::cout << "filter predicate calls: full=" << calls_all + << " with take(5)=" << calls_take << "\n"; +``` + +On a `v` with ten million elements: + +```bash +filter predicate calls: full=10000000 with take(5)=6 +``` + +**Ten million times vs six times**. After adding `take(5)`, the predicate was only called six times before stopping (one call to find each of the five elements, plus one more because the iterator steps past the fifth element before `take` reports the end), and the remaining ten million evaluations were all short-circuited away by laziness. If you only care about "the first few elements that meet the condition," this approach does orders of magnitude less work than "filtering into a complete list first and then taking the first five," because the latter (eager) must run every element through the predicate. + +## ranges::to: materializing lazy results back into containers (C++23) + +Views are lazy, but often you ultimately want a **concrete container** (for example, when you need random access multiple times, or when passing to an interface that only accepts containers). 
Materializing a view into a container is the job of `std::ranges::to`: + +```cpp +auto collected = std::vector{1, 2, 3, 4, 5, 6} + | std::views::filter([](int x) { return x % 2 == 0; }) + | std::ranges::to<std::vector<int>>(); +// collected == {2, 4, 6} +``` + +```bash +❯ ./ranges_to_demo +ranges::to (evens): 2 4 6 +``` + +:::warning There's a version trap here that Shah failed to flag +In his talk, Shah says "we have `ranges::to`" in a tone that implies it's been available alongside constrained algorithms since C++20. **It's not.** `std::ranges::to` only entered the standard in **C++23** (proposal P1206R7, feature test macro `__cpp_lib_ranges_to_container=202202L`), a full version later than the C++20 constrained algorithms. + +I compiled the same program under both standards, and the results speak for themselves: + +```cpp +auto col = v | std::views::filter(pred) | std::ranges::to<std::vector<int>>(); +``` + +```bash +❯ g++ -std=c++20 probe.cpp +probe.cpp:12:78: error: ‘to’ is not a member of ‘std::ranges’ + 12 | ... | std::ranges::to<std::vector<int>>(); + | ^~ + +❯ g++ -std=c++23 probe.cpp && echo OK +OK +``` + +`-std=c++20` fails outright with `'to' is not a member of 'std::ranges'`; only `-std=c++23` compiles successfully. So if your project is still on C++20, `ranges::to` won't work: you'll have to write a `push_back` loop by hand (with a `reserve` when the size is known), or use `std::copy` with an inserter. The minimum toolchain versions are roughly GCC 14 / Clang 18+libc++ / MSVC VS2022 17.5. + +:::tip Pipe support is also C++23, not a "later addition" +The pipe syntax like `r | ranges::to()` comes from proposal P2387R3. It landed in C++23 **alongside** P1206, not as "first there was `ranges::to`, and pipe support was patched in later." So you don't need to worry about "the pipe version being a patch": it was a complete part of C++23 from the start. +::: +::: + +## Views cheat sheet: which standard introduced which + +This is another key addition in this article. 
Views have continued to expand since C++20, with C++23 adding a large batch and C++26 still adding more. In his talk, Shah broadly labels `drop_while`, `chunk_by`, `zip`, and `zip_transform` as "new things," but **doesn't flag the versions**. These actually belong to different standards, and mixing them up will cause compilation failures. I've listed the version attributions verified against cppreference: + +| Standard | Views (representative) | +|------|------| +| **C++20** | `filter`, `transform`, `take`, `drop`, `take_while`, `drop_while`, `reverse`, `join`, `split`, `keys`, `values`, `elements`, `iota` (infinite), `lazy_split`, `common`, `counted`, `all` | +| **C++23** | `zip`, `zip_transform`, `chunk`, `chunk_by`, `slide`, `join_with`, `stride`, `cartesian_product`, `as_const`, `as_rvalue`, `enumerate`, `adjacent`, `adjacent_transform`, `pairwise`, `pairwise_transform`, `repeat` (factory) | +| **C++26** | `cache_latest` (along with `concat`, `to_input`, `indices` etc. in progress) | + +:::warning A few versions that are easy to misremember + +- **`drop_while` is C++20**, not C++23. Don't relegate it to '23 just because it "looks new." +- **`chunk_by`, `zip`, and `zip_transform` are C++23** (`zip`/`zip_transform` come from P2321, `chunk_by` from P2443), requiring `-std=c++23`. +- **`as_rvalue` is C++23**, very easily misremembered as C++26 because it sounds "very new," but it actually came in alongside the zip batch. +- **`join` is C++20, but `join_with` is C++23**. Don't assume the version with `_with` is C++20. +::: + +Let's test-drive a few C++23 views to get a feel for their power. 
`chunk_by` groups consecutive equal elements: + +```cpp +std::vector run{1, 1, 2, 3, 3, 3, 4, 5}; +for (auto ch : run | std::views::chunk_by([](int a, int b) { return a == b; })) { + std::cout << '['; + for (int x : ch) std::cout << x; + std::cout << ']'; +} +``` + +```bash +❯ g++ -std=c++23 -O2 chunk.cpp -o chunk && ./chunk +[11][2][333][4][5] +``` + +Consecutive equal elements are each grouped together. `zip` "zips" multiple ranges for parallel traversal, taking the length of the shortest one: + +```cpp +std::vector a{1, 2, 3}; +std::vector b{'x', 'y', 'z'}; +for (auto [x, y] : std::views::zip(a, b)) { + std::cout << '(' << x << y << ')'; +} +``` + +```bash +❯ ./zip_demo +(1x)(2y)(3z) +``` + +Previously, to traverse two containers in parallel, you had to manually write two indices and worry about out-of-bounds access; `zip` turns this into a one-liner pipeline, and you can even directly use structured bindings to unpack the results. These new C++23 views significantly broaden the boundaries of what "expressing data processing pipelines with pipes" can do. + +## Custom iterators: an iterator is just a "pseudo-pointer with replaceable forward logic" + +:::tip This section is advanced and can be skipped +If you want a more solid understanding of "what an iterator really is," you can write one yourself. 
Below is a minimal singly-linked-list node iterator. It shows that **the essence of an iterator is simply an object that "can `++`, can `*`, and can be compared," and the forward logic is completely replaceable.** +::: + +```cpp +struct Node +{ + int data; + Node* next; +}; + +struct NodeIterator +{ + Node* current; + + int& operator*() const { return current->data; } + NodeIterator& operator++() { current = current->next; return *this; } + bool operator!=(const NodeIterator& other) const { return current != other.current; } +}; +``` + +These operations (dereference, prefix `++`, inequality comparison, plus being copyable) are all a range-based for loop needs. To count as a `std::forward_iterator` for the constrained algorithms, the type additionally has to provide `value_type` and `difference_type`, postfix `++`, `operator==`, and default construction, but the forward logic stays exactly the same. Whether the container internally uses a linked list, a tree, or a graph, it can masquerade as "a pseudo-pointer that can step forward one at a time" on the outside. This is the power of the iterator abstraction, and it's why Ranges chose to build on top of iterators rather than starting from scratch. + +## Pitfall checklist: things to watch out for even with Ranges + +Finally, let's consolidate the pitfalls scattered across the three parts of this series for your review. Ranges make many errors **harder to commit**, but they don't eliminate them: + +1. **`std::advance` does not perform bounds checking**: stepping past the end is undefined behavior (often a segfault); in generic code, check with `std::distance` first. +2. **`begin`/`end` must come from the same container**: `process(f().begin(), f().end())` is UB; store the result of `f()` in a named variable first. +3. **`list`/`set` iterators do not support `+n`/`-n`**: sort a `list` with its member `sort()`; don't force `std::sort`. +4. **Views do not own data**: they are merely a view of the underlying range. Once the underlying container is invalidated (due to reallocation, rehashing, or destruction), the view dangles. **Don't let a view's lifetime exceed the container it observes.** +5. 
**`ranges::to` without a `take` safety net will exhaust memory**: directly `ranges::to()`-ing an infinite `iota` will materialize infinitely and blow up memory; always `take` to limit it first. +6. **`reverse` needs a bidirectional range**: some views place requirements on the underlying iterators; applying `views::reverse` to a forward-only range such as a `forward_list` fails to compile. +7. **Algorithm error messages aren't necessarily shorter**: ranges use concepts to intercept errors earlier and more accurately, but deeply nested constraint errors can still be quite long; the real benefit is "you can't write certain bugs," not "fewer lines of error output." + +## What we've figured out across these three parts + +From index-based loops in the first part to view pipeline composition in this one, we've walked through the evolution of C++'s abstractions for "iterating and processing data." The core of this part can be distilled into a few points: constrained algorithms let you **pass fewer parameters and avoid mismatching iterator pairs**; the lazy evaluation of views is the soul of Ranges (it **doesn't copy, doesn't precompute, and processes one element through the entire pipeline during iteration**), benchmarking over 3 times faster than eager materialization (7 ms vs 23 ms) while saving memory; laziness enables **infinite ranges** (`iota`) and **short-circuiting** (adding `take(5)` reduced predicate calls from ten million down to six); `ranges::to` materializes lazy results back into containers, but **it's C++23**, so don't be misled by the tone of "we have ranges::to"; views are still evolving, with `chunk_by`/`zip`/`zip_transform` being C++23, and `cache_latest` being C++26. + +Looking back at Shah's statement that "algorithms are essentially loops," we can now complete the thought: the goal of modern C++ is precisely **to spare you from writing those loops by hand**. 
Use constrained algorithms to replace hand-written sorting/searching loops, and use view pipelines to replace multi-pass "filter → transform → collect" loops, making your code closer to "describing what you want" rather than "describing how to do it." This is the design philosophy of Ranges. + +If you want to dive deeper, there are a few directions: the concepts article in vol4 can help you understand the constraint system behind ranges; the perfect forwarding and SIMD content in the vol6 performance issue share the same lineage as views' "avoiding unnecessary copies"; and cppreference's [Ranges library](https://en.cppreference.com/w/cpp/ranges) and [Constrained algorithms](https://en.cppreference.com/w/cpp/algorithm/ranges) are the most authoritative cheat sheets. Ranges aren't perfect — issues like iterator invalidation are just harder to trigger, not eliminated — but they genuinely make "writing better, safer, higher-performance data processing code" a lot smoother than in the C++11 era. 
+ + + + + + + + + + + + + diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/index.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/index.md new file mode 100644 index 000000000..09a6991eb --- /dev/null +++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/index.md @@ -0,0 +1,39 @@ +--- +title: 'Back to Basics: C++ Ranges' +description: 'CppCon 2025 Talk Notes — Mike Shah: Introduction to C++ Ranges' +conference: cppcon +conference_year: 2025 +talk_title: 'Back to Basics: C++ Ranges' +speaker: Mike Shah +video_youtube: https://www.youtube.com/watch?v=Q434UHWRzI0 +tags: +- cpp-modern +- host +- beginner +difficulty: beginner +platform: host +cpp_standard: +- 20 +- 23 +translation: + source: documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/index.md + source_hash: bceeba17f4769ca9a7ce58aef867fdfcfb8e6adcaa970930755a2f154fba580a + translated_at: '2026-06-13T02:14:53.269922+00:00' + engine: anthropic + token_count: 217 +--- + + +## Notes + + + From Loops to Iterators: The Abstraction Path for Traversing Data + STL Algorithms in Practice and Iterator Pitfalls + Ranges, Views, and Pipeline Composition: The Power of Lazy Evaluation + diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md new file mode 100644 index 000000000..3c8cf5367 --- /dev/null +++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md @@ -0,0 +1,453 @@ +--- +title: 'The Cost of Copying and the Motivation for Moving: From `swap` to `MyString`' +description: CppCon 2025 talk notes — starting from the three deep copies in `swap`, + we build a `MyString` class by hand, reveal the copy overhead of temporary objects, + and introduce the core 
motivation behind move semantics. +conference: cppcon +conference_year: 2025 +talk_title: 'Back to Basics: Move Semantics' +speaker: Ben Saks +video_bilibili: https://www.bilibili.com/video/BV1X54y1P7uM +video_youtube: https://www.youtube.com/watch?v=szU5b972F7E +tags: +- cpp-modern +- host +- beginner +difficulty: beginner +platform: host +cpp_standard: +- 11 +- 17 +- 20 +chapter: 4 +order: 1 +translation: + source: documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md + source_hash: cf3162bc082c6000bf4e31a33c7231800ce88afe41214dab463e75576eb3479b + translated_at: '2026-06-13T02:15:28.185262+00:00' + engine: anthropic + token_count: 3123 +--- +# Starting with swap: A Tale of Three Copies + +:::tip +As a side note, this section is a derivative write-up of a CppCon talk. The link above points to a video series on YouTube; users in China can watch it via the Bilibili link. +::: + +Copying (specifically copying, not moving) is a very common operation in C++. The problem is that many objects (like containers) are expensive to copy in most cases. The introduction of move semantics aims to convert these expensive copy operations into cheap "handoff" operations. + +That sounds great, but what does "handoff" actually mean? We start with an example everyone has seen: the `swap` function. + +## C++03 swap: Three Deep Copies + +If you write a generic swap in C++03 (before move semantics), it looks like this: + +```cpp +template <typename T> +void swap(T& x, T& y) +{ + T temp(x); // copy 1: copy x's value into temp + x = y; // copy 2: copy y's value into x + y = temp; // copy 3: copy temp's value into y +} +``` + +Each line here, in terms of what actually executes, performs a copy. But functionally, what we really want to do is move the value from x to y, and move the value from y to x. For built-in types like `int`, copying and moving are the same thing: an `int` has no internal structure, so copying an `int` just duplicates 4 bytes. 
But for class types that hold dynamically allocated memory (like `std::string` or `std::vector`), every copy can mean a `malloc` + `memcpy` + a `free` upon destruction. + +Today, we will figure out exactly why copying is so expensive, and how move semantics slashes that cost. + +The experimental environment for this article is Arch Linux WSL, GCC 16.1.1. Here is the environment info: + +```bash +❯ gcc -v +Using built-in specs. +COLLECT_GCC=gcc +COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/16.1.1/lto-wrapper +Target: x86_64-pc-linux-gnu +gcc version 16.1.1 20260430 (GCC) + +❯ uname -a +Linux Charliechen 6.18.33.1-microsoft-standard-WSL2 #1 SMP PREEMPT_DYNAMIC ... x86_64 GNU/Linux +``` + +## Building a MyString from Scratch: Seeing Why Copying Is Expensive + +To make the problem crystal clear, we will write a simplified string class ourselves — `MyString`. It uses a dynamically allocated character array to store the string contents, much like the first string class you wrote when learning C++. `std::string` is far more complex than this (it has SSO optimization — short strings are stored directly inside the object without heap allocation), but MyString is enough to expose the overhead of copying. + +As a side note, if I were writing this code today, I would use a `std::unique_ptr` to manage that dynamic array. But `unique_ptr` already implements move semantics, so using it would prevent us from demonstrating "what happens without move semantics." Therefore, I am intentionally using a raw pointer. Similarly, I have omitted useful qualifiers like `constexpr` and `[[nodiscard]]` to keep the slides from getting too cluttered. 
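
For contrast, here is a minimal sketch of the `std::unique_ptr` variant that the paragraph above alludes to. The class name `MyStringUP` is made up for illustration and does not appear in the talk. It suggests why that version would hide the lesson: once a `unique_ptr` owns the buffer, the compiler-generated move operations already perform the pointer handoff, and copying is disabled automatically.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical class (not from the talk): the buffer is owned by a unique_ptr.
class MyStringUP
{
    std::size_t length_;
    std::unique_ptr<char[]> data_;

public:
    explicit MyStringUP(const char* s)
        : length_(std::strlen(s))
        , data_(std::make_unique<char[]>(length_ + 1))
    {
        std::memcpy(data_.get(), s, length_ + 1);
    }

    // No destructor, copy, or move operations are written by hand:
    // - unique_ptr releases the array on destruction;
    // - the implicit move operations hand over the pointer;
    // - the implicit copy operations are deleted (unique_ptr is move-only).
    // Caveat: the implicit move leaves length_ unchanged in the source,
    // so a moved-from MyStringUP should only be destroyed or reassigned.

    const char* c_str() const { return data_.get(); }
    std::size_t size() const { return length_; }
};
```

With this sketch, `MyStringUP b(std::move(a));` transfers the buffer without any `new` or `memcpy`, which is precisely the behavior the raw-pointer `MyString` below has to earn by hand.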
+ +### Basic Structure: Construction and Destruction + +```cpp +#include <cstddef> +#include <cstring> + +class MyString +{ + std::size_t stored_length_; + char* actual_str_; + +public: + // Constructor: allocate exactly as much memory as needed + MyString(const char* s) + : stored_length_(std::strlen(s)) + , actual_str_(new char[stored_length_ + 1]) + { + std::memcpy(actual_str_, s, stored_length_ + 1); + } + + // Destructor: release the dynamic array + ~MyString() + { + delete[] actual_str_; + } + + // Copy and move are disabled (for now) + MyString(const MyString&) = delete; + MyString& operator=(const MyString&) = delete; + + // Access the contents + const char* c_str() const { return actual_str_; } + std::size_t size() const { return stored_length_; } +}; +``` + +When we create a `"hello"` string, the memory layout looks roughly like this: `stored_length_` holds 5, and `actual_str_` points to a 6-byte block allocated on the heap (5 characters + the trailing `'\0'`). Upon destruction, `delete[] actual_str_` frees this block. Very straightforward. + +### Copy Constructor: The Necessity of Deep Copy + +Now the problem arises: if I want to create `s2` from `s1` (an independent string with the same value), can I just copy those two data members? + +```cpp +// Danger! A shallow copy leads to a double delete +MyString s1("hello"); +MyString s2(s1); // if this only copied stored_length_ and the actual_str_ pointer... +``` + +No. Because if `s2`'s `actual_str_` points to the same memory block, then both `s1` and `s2` will execute `delete[]` on the same block when they destruct. That is a double delete, which is undefined behavior. + +So the copy constructor must perform a **deep copy**: allocate memory exclusive to the new object, then copy the contents over: + +```cpp +// Copy constructor: deep copy +MyString(const MyString& other) + : stored_length_(other.stored_length_) + , actual_str_(new char[other.stored_length_ + 1]) +{ + std::memcpy(actual_str_, other.actual_str_, stored_length_ + 1); +} +``` + +This is correct, but the cost is: one `new` (heap allocation) + one `memcpy`. 
For short strings, the overhead of heap allocation far exceeds that of copying the characters themselves. + +### Copy Assignment Operator: Overwriting an Existing Object + +Copy construction and copy assignment are easily confused because both can use the `=` operator. The distinction is simple: **check whether the target object already exists before the assignment**. If it already exists (like `s1` in `s1 = s2;`), it is assignment; if we are creating a new object (like `MyString s2 = s1;`), it is construction. + +The implementation of assignment has one extra step compared to construction: we must clean up the old value first. + +```cpp +// Copy assignment operator +MyString& operator=(const MyString& other) +{ + if (this != &other) { + delete[] actual_str_; // clean up the old value + stored_length_ = other.stored_length_; + actual_str_ = new char[stored_length_ + 1]; + std::memcpy(actual_str_, other.actual_str_, stored_length_ + 1); + } + return *this; +} +``` + +Note that this version `delete[]`s the old array first and only then `new`s the replacement. That order is simple but not exception-safe: if `new` throws, the old array is already gone and the new one was never allocated, leaving the object with a dangling pointer. Allocating first and deleting afterwards, or using the copy-and-swap idiom as production code should, avoids this; we set exception safety aside for now and focus on the core logic. + +### operator+: The Copy Waste of Temporary Objects + +Now MyString has complete copy operations. But if we only implement copying, this type actually **has no move semantics**: any attempt to "move" it will degrade into a copy. 
Let us look at the most typical scenario, string concatenation: + +```cpp +// Concatenate two strings +MyString operator+(const MyString& lhs, const MyString& rhs) +{ + std::size_t new_len = lhs.size() + rhs.size(); + char* buf = new char[new_len + 1]; + std::memcpy(buf, lhs.c_str(), lhs.size()); + std::memcpy(buf + lhs.size(), rhs.c_str(), rhs.size() + 1); + + MyString result(buf); // construct result from buf + delete[] buf; // free the temporary buffer + return result; // return result +} +``` + +Wait, there is a problem here. `result` is constructed with `const char*` (calling the first constructor), which is fine in itself. But the problem lies with the **caller**: + +```cpp +MyString s1("ABC"); +MyString s2("DEF"); +MyString s3 = s1 + s2; // expected: "ABCDEF" +``` + +`s1 + s2` returns a temporary `MyString` object (which internally already has a block of allocated heap memory storing `"ABCDEF"`). Setting copy elision aside for now (a later part of this series covers it), `s3` is then created from it via copy construction, which means allocating a new block of memory, copying the contents over, and then letting the temporary release its own block when it destructs. + +What we are doing is: **duplicating a block of memory that already exists and contains exactly the data we want, and then destroying the original**. If that is not waste, what is? + +## Let the Experiment Speak: How Expensive Is Copying Really? + +Simply saying "waste" is not intuitive enough. Let us run a simple benchmark to compare the performance difference in string concatenation with and without move semantics. 
+ +```cpp +#include <chrono> +#include <cstring> +#include <iostream> + +// ===== Version without move ===== +class MyStringNoMove +{ + std::size_t len_; + char* str_; + +public: + MyStringNoMove(const char* s) + : len_(std::strlen(s)) + , str_(new char[len_ + 1]) + { + std::memcpy(str_, s, len_ + 1); + } + + ~MyStringNoMove() { delete[] str_; } + + MyStringNoMove(const MyStringNoMove& o) + : len_(o.len_) + , str_(new char[o.len_ + 1]) + { + std::memcpy(str_, o.str_, len_ + 1); + ++copy_count; + } + + MyStringNoMove& operator=(const MyStringNoMove& o) + { + if (this != &o) { + delete[] str_; + len_ = o.len_; + str_ = new char[len_ + 1]; + std::memcpy(str_, o.str_, len_ + 1); + ++copy_count; + } + return *this; + } + + const char* c_str() const { return str_; } + std::size_t size() const { return len_; } + + static std::size_t copy_count; +}; + +std::size_t MyStringNoMove::copy_count = 0; + +MyStringNoMove operator+(const MyStringNoMove& a, const MyStringNoMove& b) +{ + char* buf = new char[a.size() + b.size() + 1]; + std::memcpy(buf, a.c_str(), a.size()); + std::memcpy(buf + a.size(), b.c_str(), b.size() + 1); + MyStringNoMove result(buf); + delete[] buf; + return result; +} + +// ===== Version with move ===== +class MyStringWithMove +{ + std::size_t len_; + char* str_; + +public: + MyStringWithMove(const char* s) + : len_(std::strlen(s)) + , str_(new char[len_ + 1]) + { + std::memcpy(str_, s, len_ + 1); + } + + ~MyStringWithMove() { delete[] str_; } + + // Copy constructor + MyStringWithMove(const MyStringWithMove& o) + : len_(o.len_) + , str_(new char[o.len_ + 1]) + { + std::memcpy(str_, o.str_, len_ + 1); + ++copy_count; + } + + // Move constructor! 
+ MyStringWithMove(MyStringWithMove&& o) noexcept + : len_(o.len_) + , str_(o.str_) // steal the pointer directly + { + o.str_ = nullptr; // keep the source's destructor from delete[]-ing the buffer + o.len_ = 0; + ++move_count; + } + + // Copy assignment: must deep-copy. Never use = default here: + // for a class holding a raw pointer, = default copies the pointer member-wise (shallow), so both objects double-delete on destruction. + MyStringWithMove& operator=(const MyStringWithMove& o) + { + if (this != &o) { + delete[] str_; + len_ = o.len_; + str_ = new char[len_ + 1]; + std::memcpy(str_, o.str_, len_ + 1); + ++copy_count; + } + return *this; + } + + // Move assignment: steal the pointer, null out the source + MyStringWithMove& operator=(MyStringWithMove&& o) noexcept + { + if (this != &o) { + delete[] str_; + len_ = o.len_; + str_ = o.str_; + o.str_ = nullptr; + o.len_ = 0; + ++move_count; + } + return *this; + } + + const char* c_str() const { return str_ ? str_ : "(null)"; } + std::size_t size() const { return len_; } + + static std::size_t copy_count; + static std::size_t move_count; +}; + +std::size_t MyStringWithMove::copy_count = 0; +std::size_t MyStringWithMove::move_count = 0; + +MyStringWithMove operator+(const MyStringWithMove& a, const MyStringWithMove& b) +{ + char* buf = new char[a.size() + b.size() + 1]; + std::memcpy(buf, a.c_str(), a.size()); + std::memcpy(buf + a.size(), b.c_str(), b.size() + 1); + MyStringWithMove result(buf); + delete[] buf; + return result; +} + +int main() +{ + constexpr int N = 100000; + + // Test the version without move + auto t1 = std::chrono::high_resolution_clock::now(); + { + MyStringNoMove a("Hello"); + for (int i = 0; i < N; ++i) { + MyStringNoMove b("World"); + MyStringNoMove c = a + b; + (void)c; + } + } + auto t2 = std::chrono::high_resolution_clock::now(); + + // Test the version with move + auto t3 = std::chrono::high_resolution_clock::now(); + { + MyStringWithMove a("Hello"); + for (int i = 0; i < N; ++i) { + MyStringWithMove b("World"); + MyStringWithMove c = a + b; + (void)c; + } + } + auto t4 = std::chrono::high_resolution_clock::now(); + + auto ms_nocopy = std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count(); + auto ms_withmove = 
std::chrono::duration_cast<std::chrono::milliseconds>(t4 - t3).count(); + + std::cout << "=== Concatenating " << N << " times ===\n"; + std::cout << "without move semantics: " << ms_nocopy << " ms, " + << "copies: " << MyStringNoMove::copy_count << "\n"; + std::cout << "with move semantics: " << ms_withmove << " ms, " + << "copies: " << MyStringWithMove::copy_count + << ", moves: " << MyStringWithMove::move_count << "\n"; + std::cout << "speedup: " << static_cast<double>(ms_nocopy) + / static_cast<double>(ms_withmove) << "x\n"; + + return 0; +} +``` + +Compile and run: + +```bash +❯ g++ -std=c++20 -O2 -Wall -Wextra bench.cpp -o bench && ./bench +=== Concatenating 100000 times === +without move semantics: 38 ms, copies: 100000 +with move semantics: 9 ms, copies: 0, moves: 100000 +speedup: 4.22x +``` + +Look: with move semantics, the number of copies is zero; everything becomes move operations. Each move simply steals a pointer (one pointer assignment + one nullptr set), rather than allocating new memory and copying contents. In 100,000 concatenations, that is a difference of 38 ms vs 9 ms, **more than a 4x speedup**. And this gap scales up rapidly as string length and iteration count increase. Bear in mind that compilers frequently elide these copies and moves altogether (for example via NRVO), so the counters you observe can vary with the compiler and its flags. + +## The Intuition Behind Move Semantics: Why Not Just Hand It Over? + +Going back to the earlier `s3 = s1 + s2` example. `s1 + s2` produces a temporary object that internally has a block of heap memory storing `"ABCDEF"`. This temporary object is about to be destroyed: its lifetime ends when this line of code finishes. Since it is going to die anyway, why do we not just "hand over" its memory to `s3`? + +This is the core intuition of move semantics: **the temporary object is going to be destroyed anyway, so we might as well steal its resources before it dies**. Specifically: + +1. `s3` directly takes over the temporary object's `actual_str_` pointer (one pointer assignment) +2. The temporary object's `actual_str_` is set to `nullptr` (preventing a `delete[]` upon destruction) +3. 
When the temporary object destructs, `delete[] nullptr` does nothing + +The entire process involves no `new`, no `memcpy`, and no extra memory allocation. One pointer assignment + one nullptr set, done. + +## std::string's SSO: Why Is Moving Not Always Needed? + +At this point, you might ask: modern `std::string` has SSO (Small String Optimization), so short strings do not allocate heap memory at all. Does move semantics still matter for it? + +Good question. SSO means that if a string is short enough (the threshold in libstdc++ is about 15 characters), the data is stored directly inside the object without heap allocation. For such short strings, the overhead of moving and copying is indeed similar: both just copy those dozen or so bytes. + +But once a string exceeds the SSO threshold, `std::string` falls back to heap allocation, and the advantage of move semantics becomes fully apparent: one pointer handoff vs one `malloc` + `memcpy`. Moreover, even for short strings, move semantics allows the compiler to avoid unnecessary copies in more scenarios. + +For a complete analysis of SSO, we previously discussed it in detail in vol3's [Deep Dive into string: SSO, COW, and resize_and_overwrite](../../../vol3-standard-library/02-string-memory-deep-dive.md), so we will not expand on it here. + +## What We Have Figured Out So Far + +Starting from the three deep copies in `swap`, we built a `MyString` class from scratch, saw exactly where the overhead of copying comes from (heap allocation + memory copying), and then used an experiment to show that move semantics can deliver more than a 4x performance boost. The core intuition is also simple: **the temporary object is going to die anyway, so we might as well steal its resources before it dies**. + +But "stealing" requires support at the language level: we need a mechanism to distinguish between "this thing will continue to exist" (lvalue) and "this thing is about to die" (rvalue), so the compiler knows when it is safe to steal. 
That is the topic of the next article: lvalues, rvalues, and the reference system. If you are interested in the move semantics article series in vol2, you can check out [Rvalue References: From Copying to Moving](../../../vol2-modern-features/ch00-move-semantics/01-rvalue-reference.md) first, which has a more systematic explanation. + + + + + + + diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/02-lvalue-rvalue-and-references.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/02-lvalue-rvalue-and-references.md new file mode 100644 index 000000000..6db0cba36 --- /dev/null +++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/02-lvalue-rvalue-and-references.md @@ -0,0 +1,419 @@ +--- +title: 'Lvalues, Rvalues, and References: The Type System Foundation of Move Semantics' +description: CppCon 2025 talk notes, from the K&R definition of lvalues and rvalues + to the C++11 value category system, with a detailed look at lvalue reference, `const` + reference binding rules, and rvalue references +conference: cppcon +conference_year: 2025 +talk_title: 'Back to Basics: Move Semantics' +speaker: Ben Saks +video_bilibili: https://www.bilibili.com/video/BV1X54y1P7uM +video_youtube: https://www.youtube.com/watch?v=szU5b972F7E +tags: +- cpp-modern +- host +- beginner +difficulty: beginner +platform: host +cpp_standard: +- 11 +- 17 +- 20 +chapter: 4 +order: 2 +translation: + source: documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/02-lvalue-rvalue-and-references.md + source_hash: ac81d12b8e0b09e31f41614f4a56bc363b1fdf1298f7c85f642befb8254cd351 + translated_at: '2026-06-13T02:16:38.094525+00:00' + engine: anthropic + token_count: 3726 +--- +# Lvalues, Rvalues, and References: The Type System Foundations of Move Semantics + +:::tip +This article is an in-depth adaptation of Ben Saks's "Back to Basics: Move Semantics" talk at CppCon 2025. 
Above is the YouTube link; users in China can watch the Bilibili version instead. The experimental environment for this article is Arch Linux WSL, GCC 16.1.1. +::: + +In the previous article, we used MyString experiments to prove that move semantics can reduce a heap allocation + memcpy to a single pointer assignment, speeding up 100,000 concatenations by more than 4x. The conclusion was exciting, but we left a key question hanging at the end—how does the compiler know when it's safe to "steal" resources? It needs a language mechanism to distinguish between "this object will still be used" and "this object is about to die." That distinction mechanism is lvalues and rvalues. + +Honestly, I used to have a vague sense of dread about "lvalues and rvalues." The first time I heard those two terms, my instinctive reaction was: "Isn't this just the left side and right side of the equals sign?"—and then I quickly realized things weren't that simple. The `x` in `const int x = 10;` is an lvalue, but you can't assign to it; `int&& r = 10;` is clearly bound to an rvalue, but `r` itself is an lvalue... These seemingly contradictory phenomena took me a while to fully figure out. + +## K&R's Original Definition: Left and Right of the Equals Sign + +The terms lvalue and rvalue trace back to the birth of the C language. K&R introduced these concepts in *The C Programming Language*—the "L" in "L value" comes from the assignment expression `E1 = E2`, where the thing on the **Left** of the assignment operator must have certain specific properties. Specifically, `E1` must be an expression that can be located—the compiler must be able to determine its position in memory so it can write the value of `E2` into it. + +This is the most primitive intuition: **lvalue = something that can appear on the left side of an assignment**. 
+ +Take the simplest example: + +```cpp +int n = 1; // OK: n is an lvalue, 1 is an rvalue +n = 2; // OK: n is an lvalue, so it may appear on the left of an assignment +// 1 = n; // Error! 1 is an rvalue and cannot appear on the left of an assignment +``` + +`n` is a named variable; it has a definite location in memory, the compiler knows its address, so a value can be written into it. But the literals `1` and `2`—they are pure values, the compiler doesn't allocate a writable memory address for them. You can't tell the compiler "please write n's value into the number 1," because the number 1 simply doesn't have an "inside." + +This is the first level of understanding lvalues and rvalues. At this level, everything looks fine—lvalues are "things with an address that can be assigned to," and rvalues are "things without an address that can't be assigned to." + +But wait—have you noticed that this definition carries an implicit assumption? It assumes that "can appear on the left side of an assignment" and "has a memory address" are the same thing. In the very early days of C, this assumption basically held. But C soon introduced `const`, and C++ introduced references, class types, temporary objects... As the language grew more complex, this assumption started to fall apart. Next, we'll see how this crack appeared, and why understanding it is crucial for move semantics. + +## Basic Classification: Literals and Named Variables + +Before we start patching those cracks, let's get the basic classification straight, because these rules haven't changed from the C era to today. + +**Literals are rvalues.** Integer literals like `3`, floating-point literals like `3.14`, character literals like `'a'`, enumeration constants—they are all rvalues. They have no memory address (at least not from the programmer's perspective), you can't assign to them, they are simply "values" themselves. + +**Named variables are lvalues.** `int n;` declares a variable `n` that has a location in memory; you can both read from and write to it.
The key point is: an lvalue can appear on **either side** of an assignment expression. In `n = 1`, `n` is on the left (being written to); in `m = n`, `n` is on the right (being read). But what happens when `n` is on the right? It gets read—the compiler extracts the value stored at `n`'s memory location. This "read" operation has a formal name: **lvalue-to-rvalue conversion**. + +This conversion is almost everywhere, we just don't usually notice it. Whenever you write `int b = a;`, `a` is an lvalue, but to assign it to `b`, the compiler must first read out the value stored in `a`—this step is the lvalue-to-rvalue conversion. Understanding that this conversion exists is important because it explains a subtle fact: **lvalues and rvalues are not two kinds of "things," but two "properties" of expressions**. The same variable `a` can exhibit lvalue properties or rvalue properties in different contexts. + +## const Objects: The First Crack in K&R's Definition + +Now here's the problem. Look at this code: + +```cpp +const int max = 100; +// max = 200; // Error! max is const and cannot be assigned to +printf("max = %d\n", max); +printf("&max = %p\n", (void*)&max); // But max does have an address! +``` + +`max` is a const object. You can't assign to it—`max = 200` is a compiler error. According to K&R's definition of "lvalue = can appear on the left side of an assignment," `max` shouldn't be an lvalue. But in reality, `max` does have a memory address; you can take its pointer (`&max` is legal), and you can read its value through that pointer. + +This is the crack in K&R's definition: **const objects are lvalues, but are not assignable**. The standard terminology calls them "non-modifiable lvalues." + +This distinction is very important because it reveals the true core of the lvalue concept—**having an address**, not **being assignable**. A `const int` object has an address but is not assignable; an integer literal `3` has neither an address nor is it assignable. The former is a non-modifiable lvalue, the latter is an rvalue.
The key to distinguishing them isn't "can you assign to it," but "does it have a persistent memory location." + +The actual output from GCC 16.1.1 confirms this: + +```text +max = 100 +&max = 0x7ffc47a05dc8 +``` + +`&max` prints a valid stack address—this const object genuinely exists in memory. + +We can draw a comparison here to deepen our understanding. The `max` in `const int max = 100;` is a non-modifiable lvalue: it has an address, you can't assign to it, but you can take its address and read through a pointer. The literal `100` is an rvalue: it has no address, and you can't assign to it either. What they share is "not assignable," but the crucial difference lies in "having a persistent memory location." This difference becomes very important when we get to class types and reference binding—because the compiler uses "having a persistent location" to decide which references can bind to which expressions. + +## Class-Type Rvalues: Can Call Member Functions + +The distinction between lvalues and rvalues gets more interesting with class types. Consider a simple struct: + +```cpp +struct Widget +{ + int value; + void f() + { + // this points to the object the function was called on + printf("Widget::f(), value = %d, this = %p\n", value, (void*)this); + } +}; +``` + +We have two ways to get a class-type rvalue. The first is a function return value: a function that returns a `Widget` by value has a class rvalue as its return value. The second is functional-style cast: `Widget(7)` converts the integer 7 into a temporary object of type `Widget`, which is also a class rvalue. + +The interesting part is: **you can call member functions on a class rvalue**. + +```cpp +Widget(7).f(); // OK: calls f() on a temporary Widget +make_widget(42).f(); // OK: calls f() on the temporary object returned by the function +``` + +This seems a bit strange—isn't an rvalue something "without an address"? How can you call a member function on something without an address?
The answer is that the compiler does something behind the scenes: it allocates a location in memory for this temporary object—the standard calls this process **temporary materialization conversion**. The `this` pointer points to that temporarily allocated memory location. + +I ran this on GCC 16.1.1, and the results are quite interesting: + +```text +Widget::f(), value = 7, this = 0x7ffc9a466b04 +Widget::f(), value = 42, this = 0x7ffc9a466b04 +``` + +Notice—the `this` addresses from both calls are exactly the same! The most likely reason is that each temporary is destroyed at the end of its own full-expression, so the compiler is free to reuse the same stack slot for the next one. Copy elision also plays a part: the object returned by `make_widget` is constructed directly in the caller's stack space rather than being copied out of the function. These temporary objects have short lifetimes, but they do have real memory locations while they're alive. + +:::warning The version history of temporary materialization—two things need to be distinguished here +Saying "rvalues have no address" isn't quite accurate. The precise statement is—an rvalue **doesn't need** to have an address; it is not a persistent memory location. But if the compiler temporarily allocates a block of memory for it to implement some operation (like calling a member function, or binding to a reference), then in that instant it "has an address." This process of the compiler implicitly allocating memory is temporary materialization. + +As for its version history, we need to separate two things: the lvalue / xvalue / prvalue **value category triad** was indeed introduced in C++11; but "**temporary materialization conversion**" as a named standard conversion was only formally established in **C++17**.
It was written into the language rules alongside C++17's mandatory copy elision (proposal P0135), with the core idea being: **a prvalue isn't necessarily an object itself; it only "materializes" into a temporary object when it needs to be used as one (like calling a member function, or binding to a reference)**. In the C++11 era, this mechanism was still gestating and hadn't been formally named. So strictly speaking, the temporary materialization in `Widget(7).f()` above is standard semantics only from C++17 onward—don't conflate it with C++11's value category triad. +::: + +:::warning +Class rvalues being able to call member functions is the foundation of move semantics. Move constructors and move assignment operators are essentially "member functions called on temporary objects about to be destroyed"—through rvalue references, we gain the ability to modify these temporary objects. +::: + +## Lvalue References: The First Rule of Binding + +Now we enter the world of references. Before C++11 introduced rvalue references, what C++ called a "reference" was what we now formally call an "lvalue reference." + +"An lvalue reference to T must bind to a T-type lvalue"—this sentence sounds convoluted, but the meaning is simple. A reference of type `int&` can only bind to an lvalue of type `int`: + +```cpp +int n = 10; +int& ri = n; // OK: ri binds to the lvalue n +// int& ri2 = 10; // Error! An lvalue reference cannot bind to an rvalue (a literal) +``` + +Why is `int& ri2 = 10` an error? Because `10` is an rvalue; it has no persistent memory location. A reference needs to know the address of what it's referencing, but an rvalue has no address—hence the contradiction. + +But there's a very important exception here: **a const lvalue reference can bind to an rvalue**.
+ +```cpp +const int& cri = 10; // OK: a const reference can bind to an rvalue +const int& cri2 = 3.14; // OK: it can even bind across types (double -> int conversion) +``` + +The mechanism behind this is: the compiler quietly creates a temporary `int` object to store that value (or the converted value), and then lets the const reference bind to this temporary object. For `const int& cri2 = 3.14;`, the compiler first does the conversion from `double` to `int` (3.14 becomes 3), creates a temporary `int` holding 3, and then `cri2` binds to this temporary object. That's why I saw `const lvalue ref to converted: 3` in the GCC output—3.14 was truncated. + +You might ask: why must it be `const`? Because if you allowed a non-const reference to bind to an rvalue, you could modify a temporary object through that reference—and that temporary object might be destroyed immediately, making the modification pointless and prone to bugs. A const reference binding to a temporary object means you can only read it, not modify it, so it's safe. + +This rule has another important corollary: **a const reference extends the lifetime of a temporary object**. Normally, the temporary object in `Widget(7).f()` would be destroyed after the statement ends. But if a const reference binds to it, the temporary object's lifetime is extended to be as long as the reference. + +Here's a concrete example to show how important this is. Suppose you wrote a function that returns a `std::string`, and you receive it with a const reference: + +```cpp +std::string get_name() { return "hello"; } + +const std::string& name = get_name(); +// name is still valid here: the temporary's lifetime has been extended +printf("%s\n", name.c_str()); // safe +``` + +Without the const reference's lifetime extension rule, the temporary `std::string` returned by `get_name()` would be destroyed after the statement ends, and `name` would become a dangling reference. But because `const std::string&` binds to this temporary object, the compiler guarantees the temporary lives at least until `name` goes out of scope.
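To make the extension visible, here is a minimal sketch that records when a temporary is constructed and destroyed. The `Tracer` type, `make_tracer`, and the two trace functions are hypothetical names invented for this illustration; they are not part of the talk or of `MyString`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: appends to a log on construction and destruction.
struct Tracer
{
    std::string* log;
    explicit Tracer(std::string* l) : log(l) { *log += "ctor;"; }
    ~Tracer() { *log += "dtor;"; }
};

// Returns a Tracer by value, so the call expression is a class rvalue.
Tracer make_tracer(std::string* log) { return Tracer(log); }

// The temporary is bound directly to a const reference: it lives as long as ref.
std::string extended_lifetime_trace()
{
    std::string log;
    {
        const Tracer& ref = make_tracer(&log);
        (void)ref;
        log += "body;";
    }   // the temporary is destroyed here, when ref goes out of scope
    return log;
}

// No reference is bound: the temporary dies at the end of its own statement.
std::string unextended_lifetime_trace()
{
    std::string log;
    {
        make_tracer(&log);
        log += "body;";
    }
    return log;
}
```

Assuming the compiler elides the copy in `make_tracer` (guaranteed from C++17, and the default behavior of GCC and Clang in earlier modes), the first function should return `ctor;body;dtor;` and the second `ctor;dtor;body;`. The only difference between the two is whether the reference keeps the temporary alive past the end of its statement.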
+ +There's a subtle pitfall here, though—only the "first" reference that directly binds to the temporary object extends its lifetime; indirect binding through a reference chain doesn't count. For example, in `const std::string& r2 = name;`, `r2` binds to `name` (an lvalue), which doesn't involve a temporary object, so there's no lifetime extension. But if you have a situation involving multiple levels of indirect binding to a temporary object, you need to be careful. We discuss this in more detail in vol2's [Rvalue References: From Copy to Move](../../../vol2-modern-features/ch00-move-semantics/01-rvalue-reference.md). + +:::warning +Note: An rvalue reference `T&&` also has the effect of extending a temporary object's lifetime. `std::string&& r = get_name();` will also keep the returned temporary object alive until `r` goes out of scope. This is a commonality between rvalue references and const lvalue references—they can both bind to temporary objects and extend their lifetimes. The difference is that an rvalue reference allows you to modify the temporary object, while a const lvalue reference does not. +::: + +## Rvalue References: Born for Move Semantics + +C++11 introduced a new reference type—the rvalue reference, written with `&&`. + +```cpp +int&& ri = 10; // OK: an rvalue reference binds to an rvalue (the literal 10) +// int&& ri2 = n; // Error! An rvalue reference cannot bind to an lvalue +``` + +The binding rules for rvalue references are the "reverse" of lvalue references: `int&&` can only bind to an rvalue of type `int`. `int&& ri2 = n` is a compiler error because `n` is an lvalue. + +:::warning +Even `const int&&` can only bind to rvalues—adding const to an rvalue reference doesn't suddenly let it bind to lvalues. This point is often confused. const rvalue references are almost never seen in practice, and the standard library has virtually no use cases for them, but they do exist. +::: + +What's the actual use of rvalue references?
The key point is this: **through an rvalue reference, we can modify temporary objects**. + +```cpp +int&& ri = 10; // the compiler creates a temporary int object for the literal 10 +ri = 20; // OK: we modified that temporary object +``` + +For simple types like `int`, this has no practical significance. But when we talk about class types—imagine a `MyString&&` that binds to a temporary `MyString` object, and that temporary object internally has a dynamically allocated character array. Through this rvalue reference, we can directly "steal" the pointer to that array, set the temporary object's pointer to `nullptr`, and then let the temporary object's destructor do nothing. + +This is exactly what the signatures of move constructors and move assignment operators express: they receive parameters through rvalue references, telling the compiler "I know this is a temporary object, and I can safely steal its resources." But that's the topic for the next article; let's first finish completing our understanding of the reference system. + +You might also ask a more fundamental question: why did C++11 introduce an entirely new reference type to do this? Why not just reuse lvalue references? Suppose the move constructor's signature were `MyString(MyString& s)`. It would not be ambiguous with the copy constructor `MyString(const MyString& s)`, because the two differ in constness. The real problem is that a non-const lvalue reference cannot bind to an rvalue: when the compiler sees `s1 + s2` (an rvalue), the `MyString&` overload is not viable, so the `const MyString&` overload is chosen and a "move" is still never triggered. Rvalue references fill this gap: they're specifically designed to bind to rvalues, and their binding rules don't overlap with lvalue references, so overload resolution can automatically distinguish between "this is a persistent object (copy it)" and "this is a temporary object (steal its resources)."
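The way overload resolution separates the two cases can be sketched with a pair of free functions. The names `which`, `make_temp`, and the `pick_for_*` helpers are hypothetical and exist only for this illustration; `std::string` stands in for `MyString`.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair: the names exist only for this illustration.
const char* which(const std::string&) { return "copy"; }  // binds lvalues and rvalues
const char* which(std::string&&)      { return "move"; }  // binds rvalues only

std::string make_temp() { return "temporary"; }

// Each helper reports which overload the compiler selects for one kind of argument.
std::string pick_for_lvalue()
{
    std::string s = "named";
    return which(s);              // s is an lvalue: only the const& overload is viable
}

std::string pick_for_prvalue()
{
    return which(make_temp());    // prvalue: both are viable, && is the better match
}

std::string pick_for_xvalue()
{
    std::string s = "named";
    return which(std::move(s));   // xvalue: both are viable, && is the better match
}
```

Nothing is actually copied or moved here, because each overload only reports that it was chosen. That is the point of the sketch: the choice between the "copy path" and the "move path" is made entirely by reference binding rules, before any constructor body runs.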
+ +## C++11's Value Category System: lvalue, xvalue, prvalue + +So far I've been talking about just two categories, "lvalue" and "rvalue," as if the whole world were black and white. But in reality, to support move semantics, C++11 expanded the value category system from binary to ternary. + +Before C++11, every expression was either an lvalue or an rvalue—simple as that. But C++11 introduced a third category: **xvalue (expiring value)**. An xvalue represents "this object is about to expire, and its resources can be moved." + +The new classification system works like this. First, all expressions are categorized along two dimensions: "has identity" (can determine a memory location) and "can be moved": + +| Category | Has Identity | Can Be Moved | Examples | +|------|:--------:|:----------:|------| +| **lvalue** | Yes | No | Named variable `n`, `*p`, `++i` | +| **xvalue** | Yes | Yes | Result of `std::move(n)` | +| **prvalue** | No | Yes | Literal `42`, `Widget(7)`, temporary object returned by a function | + +Then there are two composite concepts: **glvalue** (generalized lvalue) = lvalue + xvalue, **rvalue** = xvalue + prvalue. Represented as a diagram: + +```text + expression + / \ + glvalue rvalue + / \ / \ + lvalue xvalue prvalue +``` + +- **lvalue**: Has identity, cannot be moved—ordinary named variables. +- **xvalue**: Has identity, can be moved—the return value of `std::move(x)`. It has a name (or rather, a definite memory location), but the compiler is told "you can move its resources away." +- **prvalue** (pure rvalue): No identity, can be moved—pure temporary values, like literals and temporary objects returned by functions. + +This system looks considerably more complex than the binary classification, but its design logic is clear: move semantics needs a mechanism to express "this thing's resources can be stolen," and xvalue is that bridge.
What `std::move` essentially does is convert an lvalue into an xvalue, telling the compiler "although this object still has a name, you can move its resources away." + +### Value Categories of Common Expressions + +Looking at just the definitions might still feel abstract, so let's list the most common expressions we use in daily coding and mark which category each belongs to: + +| Expression | Value Category | Reason | +|--------|--------|------| +| `n` (named variable) | lvalue | Has a name, has a definite memory location | +| `*p` (dereference) | lvalue | The object pointed to has a memory location | +| `++i` (pre-increment) | lvalue | Returns the modified `i` itself | +| `i++` (post-increment) | prvalue | Returns a copy of the old value, a temporary | +| `42` (integer literal) | prvalue | Pure value with no memory location | +| `"hello"` (string literal) | lvalue | String literals are const char arrays with an address | +| `Widget(7)` (functional-style cast) | prvalue | Creates a temporary Widget object | +| `make_widget()` (return by value) | prvalue | Temporary value returned by a function | +| `std::move(n)` | xvalue | Explicitly converts an lvalue to a "movable" state | +| `a.m` (member access, a is lvalue) | lvalue | Follows the identity property of `a` | +| `std::move(a).m` (member access, a is xvalue) | xvalue | Follows the xvalue property of `a` | + +A few points are worth special attention. The string literal `"hello"` is an lvalue, which often surprises people—it's actually an array of type `const char[6]`, stored in the program's read-only data segment, has a definite address, and is therefore an lvalue. Post-increment `i++` returns a copy of the old value (a temporary), so it's a prvalue; while pre-increment `++i` returns the modified object itself, so it's an lvalue. The value category of the member access expression `a.m` follows the value category of `a`—if `a` is an lvalue, `a.m` is an lvalue; if `a` is an xvalue, `a.m` is an xvalue.
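The rows of this table can be checked at compile time. The sketch below relies on the fact that `decltype((expr))` yields `T&` for an lvalue, `T&&` for an xvalue, and `T` for a prvalue. The `Category` enum, `category_of`, the `VALUE_CATEGORY` macro, and the `Pair` struct are hypothetical helpers made up for this check; `Pair{7}` plays the role of the `Widget(7)` row.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

enum class Category { lvalue, xvalue, prvalue };

// Maps the type produced by decltype((expr)) back to a value category.
template <typename T>
constexpr Category category_of()
{
    return std::is_lvalue_reference<T>::value ? Category::lvalue
         : std::is_rvalue_reference<T>::value ? Category::xvalue
                                              : Category::prvalue;
}

// Hypothetical helper macro. A macro is used because value category is a
// property of an expression, not of a type or an object.
#define VALUE_CATEGORY(expr) category_of<decltype((expr))>()

struct Pair { int m; };  // hypothetical aggregate for the member-access rows

inline void check_table()
{
    int i = 0;
    int* p = &i;
    Pair a{1};

    static_assert(VALUE_CATEGORY(i) == Category::lvalue, "named variable");
    static_assert(VALUE_CATEGORY(*p) == Category::lvalue, "dereference");
    static_assert(VALUE_CATEGORY(++i) == Category::lvalue, "pre-increment");
    static_assert(VALUE_CATEGORY(i++) == Category::prvalue, "post-increment");
    static_assert(VALUE_CATEGORY(42) == Category::prvalue, "integer literal");
    static_assert(VALUE_CATEGORY("hello") == Category::lvalue, "string literal");
    static_assert(VALUE_CATEGORY(Pair{7}) == Category::prvalue, "temporary object");
    static_assert(VALUE_CATEGORY(std::move(i)) == Category::xvalue, "std::move");
    static_assert(VALUE_CATEGORY(a.m) == Category::lvalue, "member of lvalue");
    static_assert(VALUE_CATEGORY(std::move(a).m) == Category::xvalue, "member of xvalue");

    // The operands of decltype are never evaluated, so i is not modified above.
    (void)p;
    (void)a;
}
```

If any row of the table were wrong for your compiler, the corresponding `static_assert` would fail to compile and name the row in its message.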
+ +## Verifying Value Categories with the Compiler + +We've discussed a lot of theory; now let's actually verify things using `decltype` and type traits. `decltype` has a useful property: when applied to a **parenthesized** variable name `decltype((x))`, it gives different types depending on the expression's value category—lvalues yield `T&`, xvalues yield `T&&`, and prvalues yield `T`. + +```cpp +#include <cstdio> +#include <type_traits> +#include <utility> + +template <typename T> +void print_category() +{ + printf(" is lvalue ref: %s\n", + std::is_lvalue_reference_v<T> ? "yes" : "no"); + printf(" is rvalue ref: %s\n", + std::is_rvalue_reference_v<T> ? "yes" : "no"); +} + +int main() +{ + int n = 10; + + printf("decltype((n)):\n"); // n is an lvalue + print_category<decltype((n))>(); // int& → lvalue ref: yes + + printf("decltype(10):\n"); // 10 is a prvalue + print_category<decltype(10)>(); // int → neither kind of reference + + printf("decltype(std::move(n)):\n"); // std::move(n) is an xvalue + print_category<decltype(std::move(n))>(); // int&& → rvalue ref: yes + + return 0; +} +``` + +The output from GCC 16.1.1 perfectly confirms the theory: + +```text +decltype((n)): + is lvalue ref: yes + is rvalue ref: no +decltype(10): + is lvalue ref: no + is rvalue ref: no +decltype(std::move(n)): + is lvalue ref: no + is rvalue ref: yes +``` + +`decltype((n))` yields `int&` because `(n)` is an lvalue expression. `decltype(10)` yields `int` (the bare type) because `10` is a prvalue. `decltype(std::move(n))` yields `int&&` because the return value of `std::move` is an xvalue, and xvalues manifest as `T&&` in `decltype`. + +## "If It Has a Name, It's an Lvalue"—The Trap of Rvalue Reference Parameters + +Now it's time to talk about a pitfall that almost every C++ newcomer falls into. Ben Saks specifically emphasized this rule in his talk: **if something has a name, it's an lvalue**. + +Consider a function that receives an rvalue reference: + +```cpp +void process(MyString&& s) +{ + // Here, is s an lvalue or an rvalue?
+} +``` + +From outside the function, when you call `process(s1 + s2)`, `s1 + s2` is an rvalue, so this call is fine—an rvalue reference can bind to an rvalue. But **inside** the function, the parameter `s` has a name. It's a named object. According to the "if it has a name, it's an lvalue" rule, **within the function body, `s` is treated as an lvalue**. + +What does this mean? If you want to move resources from `s` again inside the function body, you can't do it directly—the compiler will treat `s` as an lvalue and choose copy instead of move. You must explicitly use `std::move(s)` to tell the compiler "I know what I'm doing, please treat it as an rvalue." + +```cpp +void process(MyString&& s) +{ + MyString copy(s); // Copy! s is an lvalue here + MyString moved(std::move(s)); // Move! std::move turns s into an rvalue +} +``` + +The logic behind this rule is actually quite reasonable: the function body might have many lines of code, and `s` might still be used on line ten after being moved on line one. The compiler can't assume "you only use it on the last line," so it chooses the conservative strategy—things with names aren't automatically moved; you must explicitly authorize it. + +:::tip +This "name = lvalue" rule can be verified with `decltype`. If you write `decltype((s))` in a function template, when `s`'s declared type is `MyString&&`, `decltype((s))` will still yield `MyString&` (lvalue reference), not `MyString&&`. Because the parenthesized `decltype` looks at the expression's value category, and `s` as a named object has the value category lvalue. This is often used to set traps in interview questions. +::: + +:::tip +This "if it has a name, it's an lvalue" rule has one important exception: **return statements**. When `s` is a local variable or a by-value parameter, the `s` in `return s;` has a name, but since C++11 overload resolution first treats it as an rvalue, so the compiler can move from it without you needing to write `std::move(s)`. C++20 calls such a variable an "implicitly movable entity" and extends the rule to rvalue reference parameters such as the `MyString&& s` above.
And in fact, the compiler might do even better—eliminating the copy entirely through NRVO. We'll save the full discussion of this topic for the next article. +::: + +## Reference Binding Rules Cheat Sheet + +Let's organize all the reference binding rules covered in this article into a single table for easy reference: + +| Reference Type | Can Bind to lvalue? | Can Bind to rvalue? | Can Bind to Different Type? | Can Modify Referenced Object? | +|----------|:-----------------:|:-----------------:|:------------------:|:-----------------:| +| `T&` | Yes | **No** | No | Yes | +| `const T&` | Yes | **Yes** | Yes (with conversion) | No | +| `T&&` | **No** | Yes | No | Yes | +| `const T&&` | **No** | Yes | No | No | + +This table packs in a lot of information, but a few key conclusions are worth remembering. First, `const T&` is a "universal receiver"—it can bind to almost anything (lvalue, rvalue, even different types), at the cost of not being able to modify the referenced object through it. Second, `T&&` only binds to rvalues, which is exactly what move semantics needs: it guarantees that what's bound is always an object whose "resources can be safely stolen." Third, `const T&&` exists but is virtually useless—it can bind to rvalues but can't modify them, which loses the core advantage of rvalue references: "allowing modification of temporary objects." + +## What We've Figured Out So Far + +In this article, starting from K&R's "left side of the equals sign," we step by step built the complete picture of C++ value categories. We saw how const objects broke the old definition of "lvalue = assignable," how class rvalues gain memory locations through temporary materialization, how lvalue references and rvalue references have starkly different binding rules, and finally how we found the theoretical foundation for move semantics in C++11's lvalue/xvalue/prvalue ternary system. 
+ +The core takeaways are two: first, an rvalue reference `T&&` only binds to rvalues, which gives the compiler a natural signal—"the thing bound to it is temporary, and its resources can be safely stolen." Second, the "if it has a name, it's an lvalue" rule means we sometimes need `std::move` to explicitly tell the compiler "please allow moving." + +Looking back, the distinction between lvalues and rvalues wasn't invented out of thin air by C++11—it has existed since the C language era, just in a much simpler form. C++ introduced const, class types, references, operator overloading, and each step made the boundaries of value categories more blurred, until move semantics needed a precise mechanism to distinguish "persistent" from "temporary" objects, and C++11 finally formalized this system into the three-level classification of lvalue/xvalue/prvalue. Understanding the evolutionary logic of this system will make learning `std::move`, move constructors, perfect forwarding, and other concepts much smoother—because their designs are all responding to the same question: "How does the compiler know whether this object can be safely moved?" + +With this theoretical foundation, in the next article we can move into practice—implementing a move constructor and move assignment operator for MyString, seeing exactly how `std::move` works, and under what conditions copy elision lets us skip moving entirely. + +If you want a more systematic explanation of rvalue references, vol2's [Rvalue References: From Copy to Move](../../../vol2-modern-features/ch00-move-semantics/01-rvalue-reference.md) is excellent supplementary material. 
+ + + + + + + + + diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/03-move-ops-stdmove-and-elision.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/03-move-ops-stdmove-and-elision.md new file mode 100644 index 000000000..466a8b2c1 --- /dev/null +++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/03-move-ops-stdmove-and-elision.md @@ -0,0 +1,591 @@ +--- +title: Move Operations, std::move, and Copy Elision +description: CppCon 2025 Talk Notes — Complete Implementation of Move Construction/Assignment, + The True Meaning of std::move, NRVO and C++17 Mandatory Copy Elision, and Moved-From + State +conference: cppcon +conference_year: 2025 +talk_title: 'Back to Basics: Move Semantics' +speaker: Ben Saks +video_bilibili: https://www.bilibili.com/video/BV1X54y1P7uM +video_youtube: https://www.youtube.com/watch?v=szU5b972F7E +tags: +- cpp-modern +- host +- beginner +difficulty: beginner +platform: host +cpp_standard: +- 11 +- 17 +- 20 +chapter: 4 +order: 3 +translation: + source: documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/03-move-ops-stdmove-and-elision.md + source_hash: 202e126a92dd9bcd611e0fd3c61f57e908558f75363044ec384e65e702c49e25 + translated_at: '2026-06-13T02:18:18.007260+00:00' + engine: anthropic + token_count: 4572 +--- +# Move Operations, std::move, and Copy Elision + +:::tip +This article is the third in a series of notes from CppCon 2025's "Back to Basics: Move Semantics" talk. The first two articles discussed copy overhead and the motivation for moving, as well as lvalues, rvalues, and the reference system. This installment focuses on a core practical question: how to write move constructors and move assignment operators, what `std::move` actually does, and how C++17's copy elision changes the game. 
+::: + +Honestly, I used to think I "understood" move semantics — isn't it just stealing pointers, how hard could it be? Until one day in a code review, I saw a colleague write `return std::move(result);`, and I casually said, "Nice, explicitly moved." Then a senior engineer next to me shut me down with one sentence: **"Are you sure writing it that way won't prevent NRVO?"** + +It took me a whole evening to figure it out — `return std::move(result)` doesn't help you optimize at all. Instead, it turns a return value transfer that the compiler could have done at zero cost into an extra move construction. From that day on, I truly realized that the devil of move semantics is entirely in the details. + +In this article, we will break down these details one by one. Our test environment is Arch Linux WSL, GCC 16.1.1, with the compiler flag `-std=c++20`. If you plan to follow along and run the code, we recommend having this version or a newer compiler ready. + +## Move Constructors: The Art of Stealing Pointers + +In the previous article, we already had complete `MyString` copy operations. Now let's add a move constructor. What this function does, in Ben Saks' words, is a **"destructive copy"** — we "steal" the source object's data, and then leave the source object in a harmless state. + +```cpp +class MyString +{ + std::size_t stored_length_; + char* actual_str_; + +public: + // ... constructors, destructor, and copy operations from before ... + + // Move constructor + MyString(MyString&& s) noexcept + : stored_length_(s.stored_length_) + , actual_str_(s.actual_str_) + { + s.actual_str_ = nullptr; + s.stored_length_ = 0; + } +}; +``` + +Let's break down this code line by line, because every line exists for a reason. + +First is the parameter type `MyString&& s` — this is an rvalue reference. An rvalue reference can only bind to an rvalue (a temporary object, the result of `std::move`, etc.), which means this constructor is only called when the compiler confirms that "the source object is about to die."
This is the first layer of safety guarantee in move semantics: the compiler gates it for you through overload resolution. + +Next is the initializer list. `stored_length_(s.stored_length_)` directly takes the source object's length — `std::size_t` is a built-in type, so the so-called "copy" is just an integer assignment, at nearly zero cost. `actual_str_(s.actual_str_)` is the key part: we directly assign the source object's pointer to the new object, so the new object now points to the heap memory previously allocated by the source object. So far, both objects point to the same memory — if we ended here, that would be a double delete, which is undefined behavior (UB). + +So the two lines in the function body are the soul. `s.actual_str_ = nullptr` nullifies the source object's pointer, and `s.stored_length_ = 0` resets the length to zero. This way, when the source object's destructor executes `delete[] actual_str_`, it actually calls `delete[] nullptr` — and the standard explicitly states that deleting a null pointer is a safe no-op. + +You might have noticed that even though the move constructor's parameter `s` is an rvalue reference, `s`'s destructor will still be called. This is a point many people overlook: a move operation does not mean "once you take over, you don't need to care about the source object anymore." On the contrary, after the move is complete, the source object is still a complete, valid object — it's just that we intentionally set its internal state to "harmless" values. It will still be destructed normally, except that nothing will be freed during destruction. + +## Overload Resolution: How Does the Compiler Choose? + +With both copy and move constructor versions available, how does the compiler choose when facing an initialization expression? The answer is overload resolution based on the value category of the argument . 
+ +```cpp +MyString s1("hello"); + +// s1 is an lvalue (it has a name) → calls the copy constructor +MyString s2(s1); + +// std::move(s1) is an rvalue → calls the move constructor +MyString s3(std::move(s1)); +``` + +In the first case, `MyString s2(s1)`, `s1` is an lvalue: it has a name, and you can take its address. The compiler sees that the argument is an lvalue, looks for a constructor that accepts `const MyString&`, and hits the copy constructor. + +In the second case, `MyString s3(std::move(s1))`, the expression `std::move(s1)` is an rvalue. The compiler looks for a constructor that accepts `MyString&&`, and hits the move constructor. This is why we need both constructors to coexist: the copy constructor handles the case where "the source object will continue to be used," and the move constructor handles the case where "the source object is going to die anyway." + +Ben Saks particularly emphasized one point in his talk: **an rvalue reference does not perform a move by itself**. It merely provides a signal to the compiler at the type system level: "this reference is bound to an rvalue." What actually decides between copy and move is overload resolution. If our `MyString` didn't have a move constructor, then `std::move(s1)` would only trigger the copy constructor too. The compiler would fall back to the `const MyString&` version, because an rvalue can also bind to `const MyString&`. It won't error out, but it won't move either. We'll mention this point again later. + +## Move Assignment Operators: Clean Up the Old Object First + +Move constructors handle the "creating a new object" scenario, while move assignment handles the "overwriting an existing object" scenario. The core logic of both is very similar, but move assignment has an extra step: you must clean up the target object's old resources first. 
+ +```cpp +MyString& operator=(MyString&& s) noexcept +{ + if (this != &s) { + delete[] actual_str_; // Step 1: release our own old resource + stored_length_ = s.stored_length_; + actual_str_ = s.actual_str_; // Step 2: steal the source object's resource + s.actual_str_ = nullptr; // Step 3: null out the source object + s.stored_length_ = 0; + } + return *this; +} +``` + +This order is important. We first call `delete[] actual_str_` to release our own previous heap memory, and then take over the source object's pointer. If we did it the other way around, assigning first and then deleting, we would free the buffer that the source object just gave us and leak our old one, leaving `actual_str_` dangling. Any later access would be a classic use-after-free. + +The self-assignment check `if (this != &s)` is equally important in move assignment. Although `s` is an rvalue reference and theoretically nobody should write code like `x = std::move(x)`, the language doesn't prohibit it, and sometimes template instantiation can produce this effect. Without the self-assignment check, `delete[] actual_str_` would release our own memory, `actual_str_ = s.actual_str_` would copy the now-dangling pointer onto itself, and the last two statements would then reset our own pointer and length. The object would survive, but its contents would be silently destroyed. + +Note that the return type is `MyString&`, an lvalue reference, not an rvalue reference. By convention an assignment operator returns a reference to the object on the left side of `=`. Whether you use `std::move` on the right side or not, the receiving end of the assignment is the object that keeps living afterwards. + +Additionally, this implementation cannot throw: `MyString`'s data members are only built-in types (`std::size_t` and `char*`), and the operations used on them here won't throw exceptions. This is also why I marked it `noexcept`. If your class has more complex data members (such as another `std::string`), you would need to carefully consider exception safety. + +## std::move: The Most Misunderstood Function in C++ + +The name `std::move` is terribly misleading. When I first saw it, I naturally assumed it "performed a move operation". After all, it's called "move." 
But the truth is, **`std::move` doesn't move anything at all**. + +Its real identity is a type cast to an rvalue reference. The standard library's implementation is roughly equivalent to: + +```cpp +template <typename T> +constexpr typename std::remove_reference<T>::type&& move(T&& t) noexcept +{ + return static_cast<typename std::remove_reference<T>::type&&>(t); +} +``` + +Ignoring the template metaprogramming gymnastics of `remove_reference`, the core is just `static_cast<T&&>(t)`. It casts the passed-in argument to an rvalue reference and returns it. That's it. It doesn't generate any move code, doesn't call any move constructor, and doesn't modify any object's state. + +Ben Saks said something very true in his talk: **if we could start over, we'd probably call it `make_movable` or `as_rvalue`**. At least that name wouldn't mislead people into thinking it performs a move. + +### Why We Need std::move: The Naming Trap in swap + +So if `std::move` doesn't move, why do we still need it? Let's look at the `swap` function. This is the scenario that best illustrates the point. + +```cpp +template <typename T> +void swap(T& x, T& y) +{ + T temp(x); // (1) + x = y; // (2) + y = temp; // (3) +} +``` + +This C++03-style `swap` performs three copies: one copy construction and two copy assignments. We naturally want to change it to a move version. After all, our previous two articles kept saying that moving is much faster than copying. But here's the problem: `x`, `y`, and `temp` inside the function body are all lvalues. They all have names, you can take their addresses, and their lifetimes span multiple statements. The compiler can't automatically treat them as rvalues. What if you still use `temp` after the third line? + +C++ has a general rule: **if something has a name, it's an lvalue**. Only nameless things (like temporary objects, most literals, or by-value function return results) can be rvalues. This rule is very reasonable: the compiler must be conservative, so it can't assume that `temp` won't be used on the next line. 
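The naming rule can be checked directly with a pair of overloads. What follows is a minimal sketch rather than code from the talk: `Widget`, `Category`, `category`, and the three wrapper functions are hypothetical names introduced here purely for illustration.

```cpp
#include <utility>

// Hypothetical stand-in for a resource-owning type such as MyString.
struct Widget {};

enum class Category { Lvalue, Rvalue };

// Overload resolution picks one of these based on the argument's value category.
constexpr Category category(const Widget&) { return Category::Lvalue; }
constexpr Category category(Widget&&)      { return Category::Rvalue; }

// A named local variable is an lvalue, even if it is never used again.
constexpr Category named_variable()
{
    Widget temp{};
    return category(temp);
}

// std::move only casts; the cast is what lets the Widget&& overload win.
constexpr Category after_std_move()
{
    Widget temp{};
    return category(std::move(temp));
}

// A nameless temporary is already an rvalue.
constexpr Category temporary()
{
    return category(Widget{});
}

static_assert(named_variable() == Category::Lvalue, "named local binds to const Widget&");
static_assert(after_std_move() == Category::Rvalue, "std::move selects Widget&&");
static_assert(temporary()      == Category::Rvalue, "temporary selects Widget&&");
```

Because the checks are `static_assert`s, the sketch only needs to compile; nothing is moved or copied at run time, which matches the point that the choice is made entirely by overload resolution.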
+ +So we need to explicitly tell the compiler: "I know `temp` won't be used again after this, please treat it as an rvalue." This is exactly the purpose of `std::move`: + +```cpp +template <typename T> +void move_swap(T& x, T& y) +{ + T temp(std::move(x)); // move-construct temp + x = std::move(y); // move-assign x + y = std::move(temp); // move-assign y +} +``` + +Every `std::move` sends a message to the compiler: **"Here, I confirm it's safe to move resources from this object."** Only after receiving this information will the compiler select the move version during overload resolution. + +### std::move Doesn't Guarantee a Move + +There's another easily overlooked trap: `std::move` doesn't guarantee that a move will actually happen. If a type only has copy operations and no move operations, the result of `std::move` will degrade to a copy. + +```cpp +struct CopyOnly +{ + CopyOnly() = default; + CopyOnly(const CopyOnly&) { std::cout << "copy\n"; } + // No move constructor! +}; + +CopyOnly a; +CopyOnly b(std::move(a)); // prints "copy": falls back to the copy constructor +``` + +Here, `std::move(a)` casts `a` to an rvalue, but `CopyOnly` doesn't have a constructor that accepts an rvalue reference. The compiler falls back to the `const CopyOnly&` version of the copy constructor (because an rvalue can also bind to `const CopyOnly&`). It won't error out, but the "move" you expected silently becomes a "copy." + +## The Naming Paradox of Rvalue Reference Parameters + +This is the most confusing part of move semantics, and it's something Ben Saks spent considerable time emphasizing. + +When we write a function that takes an rvalue reference parameter, that parameter is treated as an **lvalue** inside the function: + +```cpp +void process(MyString&& s) +{ + // s has a name → s is an lvalue + MyString copy(s); // calls the copy constructor, not the move constructor! + MyString moved(std::move(s)); // this is what calls the move constructor +} +``` + +From the perspective outside the function, the argument passed in is an rvalue (like `process(std::move(x))` or `process(MyString("temp"))`). 
But once inside the function body, `s` becomes a named variable. It exists across multiple statements, and the compiler can't assume it's only used once. So the rule that "if it has a name, it's an lvalue" still applies. + +This leads to a practical consequence: **inside a function, if you want to move resources from an rvalue reference parameter, you must explicitly use `std::move`**. And once you move from it, the value of that parameter in subsequent code becomes unpredictable. This is the moved-from state we'll discuss later in this article. + +## Implicitly Movable Return Expressions + +The good news is that the "if it has a name, it's an lvalue" rule has an important exception: the `return` statement. + +```cpp +MyString make_greeting() +{ + MyString temp("hello world"); + // ... do some work on temp ... + return temp; // no std::move needed! +} +``` + +In this code, although `temp` has a name (which would normally make it an lvalue), `return temp;` is the last use of `temp` in the function. The compiler knows that `temp`'s lifetime ends immediately after the function returns, so the standard allows it to treat `temp` as an implicitly movable entity. + +This means you **do not** need to write `return std::move(temp);`. Simply writing `return temp;` is enough: the compiler will automatically select the move constructor (or, even better, eliminate the construction entirely, which we'll cover right below). + +## NRVO: An Optimization Better Than Moving + +"Implicitly movable" isn't the end of the story. The compiler can do better than moving: it can deliver the return value to the caller at **zero cost**, without even needing a move. This is what's called **Named Return Value Optimization (NRVO)**. 
+ +```cpp +MyString make_greeting() +{ + MyString temp("hello world"); + return temp; +} + +MyString s = make_greeting(); +``` + +In a world without any elision, the execution flow would be: first construct `temp` on `make_greeting`'s stack frame, then construct a temporary return object from it (via move or copy), then destruct `temp`, then move or copy the temporary into `s`, and finally destruct the temporary. Just hearing about it sounds wasteful. + +NRVO's approach is very clever: when generating code, the compiler directly constructs `temp` at `s`'s location. Instead of constructing first and then copying, it puts the object in the right place from the very beginning. `temp` is `s`; they share the same memory. When the function returns, no copy or move is needed, because the object is already where it should be. + +Starting from C++17, part of this became **mandatory**: when a function returns a prvalue (for example `return MyString("hello world");`), the compiler must eliminate the copy, rather than "can eliminate it but doesn't have to." For that case it isn't an optional optimization anymore; it's defined behavior of the language. For historical reasons it's still called an "optimization," but it's actually a guarantee. NRVO for a named local such as `temp` is still only permitted, not required, although mainstream compilers routinely perform it in simple cases like this one. + +For the complete technical details of NRVO and RVO, we previously had a dedicated article in vol2: [RVO and NRVO: Compiler Return Value Optimization](../../../vol2-modern-features/ch00-move-semantics/03-rvo-nrvo.md). + +## Never Use std::move on Return Values + +This is probably the most common mistake I've seen related to move semantics. As mentioned earlier, `return temp;` is implicitly movable, so the compiler will either perform NRVO (zero cost) or automatically fall back to move construction (the cost of one pointer assignment). Some people might think: since `std::move` "requests a move," wouldn't `return std::move(temp);` be more explicit and safer? + +**Exactly the opposite.** + +```cpp +// Correct: allows NRVO +MyString make_good() +{ + MyString temp("good"); + return temp; +} + +// Wrong: prevents NRVO! 
+MyString make_bad() +{ + MyString temp("bad"); + return std::move(temp); // actually slower! +} +``` + +The reason lies in NRVO's trigger conditions: the `return` expression must be the name of a local object. When you write `return std::move(temp);`, the return expression is no longer the name `temp`. It's `std::move(temp)`, a function call expression. The compiler cannot perform NRVO on this expression and can only fall back to choosing move construction. + +In other words, `return std::move(temp);` forces the compiler down the move construction path, while `return temp;` gives the compiler the opportunity to take the NRVO path (zero cost). This is why Ben Saks repeatedly emphasized in his talk: **never use `std::move` on a return value**. + +We can use the compiler flag `-fno-elide-constructors` to compare the difference between the two. This flag tells GCC to skip the copy elisions that the standard merely permits, NRVO among them, letting us see what the world looks like "without NRVO." + +First, let's look at `return temp;`'s behavior with elision disabled: it falls back to move construction, because `temp` is implicitly movable. And `return std::move(temp);` is also move construction, so there's no difference between the two when elision is disabled. But once elision is enabled (the default behavior), `return temp;` becomes a no-op, while `return std::move(temp);` is still a move construction. That's where the difference lies. + +I tested this with GCC 16.1.1, adding print logs to `MyString`'s various constructors, and the comparison results are as follows: + +```bash +# NRVO enabled (the default) +$ g++ -std=c++20 -O2 test.cpp && ./a.out +=== return temp; (NRVO) === + construct: "hello" # only this one construction: no move, no copy + +=== return std::move(temp); === + construct: "hello" + move-construct: "hello" # one extra move construction! + destruct: "(null)" +``` + +See? `return std::move(temp);` clearly has one extra move construction. 
For a class like `MyString` that only has a pointer and an integer, the cost of a move construction is very low (one pointer assignment), but for more complex classes (like objects containing multiple dynamic containers), the cost of this extra move cannot be ignored. + +```bash +# Comparison with NRVO disabled +$ g++ -std=c++20 -O2 -fno-elide-constructors test.cpp && ./a.out +=== return temp; === + construct: "hello" + move-construct: "hello" # no NRVO, falls back to move construction + destruct: "(null)" + +=== return std::move(temp); === + construct: "hello" + move-construct: "hello" # also a move construction + destruct: "(null)" +``` + +With NRVO disabled, both indeed behave identically: both perform one move construction. But this precisely shows that `return std::move(temp);` throws away the NRVO opportunity for nothing under default settings. + +:::warning C++20/C++23 Further Expand the Scope of "Implicitly Movable" +The rule discussed in this section, "don't use `std::move` when returning a local object," holds true across **all standard versions (C++11 through C++26)** and is safe advice. However, the "implicitly movable" mechanism itself has been continuously strengthened in subsequent standards, and it's worth knowing about: C++11 introduced the initial implicit move (when returning a local object, the compiler can treat it as a move); C++20 (proposal P1825, which merged the wording of P1155 "More implicit moves" and P0527) expanded the scope of "implicitly movable entities," so that, for example, local variables of rvalue reference type and a `throw` of a local object were also brought into implicit move territory; C++23 (proposal P2266) further refined this, making return values treated as xvalues in certain scenarios, covering more construction paths. + +But no matter how these extensions change, **the iron rule of "don't write `std::move` when returning a local object" has never changed**. P1825/P2266 expand the scope of "what the compiler can automatically move," while `std::move` actually breaks NRVO's trigger conditions. 
The conclusion remains the same: write `return temp;`, and leave the choice between NRVO and implicit move to the compiler. +::: + +## Moved-From State: Valid but Unspecified + +After a move operation is complete, the source object is in a state that the standard calls **"valid but unspecified"**. These words are worth breaking down one by one. + +"Valid" means: no memory leaks, no resource leaks, no undefined behavior (UB). You can safely let this object destruct: its destructor will execute normally, there will be no double free, and it won't crash. For our `MyString`, after moving, `actual_str_` is set to `nullptr`, and `stored_length_` becomes 0, so `delete[]` on the null pointer does nothing during destruction. + +"Unspecified" means: you cannot make any assumptions about the values held by the moved-from object. The standard doesn't mandate that a moved-from `std::string` must be an empty string, and it makes no blanket promise for other library types either. Different standard library implementations may have different behaviors. Our own `MyString` returns `"(null)"` after moving (this is our own safety fallback), but a moved-from `std::string` might return an empty string or it might return the original value. You can't rely on it. + +```cpp +MyString a("hello"); +MyString b(std::move(a)); + +// Safe operations: +// 1. Destruction: always safe +// 2. Assigning a new value: always safe +a = MyString("new value"); // OK + +// Unsafe operations: +// 1. Assuming a still holds "hello" +// 2. Assuming a.size() is 0 +// 3. Assuming a.c_str() returns an empty string +// These assumptions may happen to hold on some implementations, but the standard does not guarantee them +``` + +:::warning Usage Restrictions on Moved-From Objects +When Ben Saks was asked in the Q&A session "can a moved-from object still be used," his answer was very straightforward: **after a move, the only thing you should do with the source object is assign a new value to it or let it destruct**. 
Any other operation (reading values, comparing, passing to other functions) is a gamble: you might win (the implementation happens to give you a predictable value), or you might lose (the implementation changes or you switch to a different standard library). Don't gamble. + +Don't confuse "valid" with "useful": a moved-from object is a legitimate object, but not one with determined contents. If you need an empty object, create one explicitly; if you need a specific value, assign it explicitly. Don't count on the move operation to do these things for you. +::: + +## The Importance of noexcept: The Hidden Trap in Vector Reallocation + +Finally, let's discuss an issue that is often overlooked in real-world engineering but has a massive impact: **move constructors should be `noexcept`**. + +Why? Let's look at the `std::vector` reallocation scenario. When `vector`'s capacity is insufficient, it needs to allocate a larger block of memory and then transfer the old elements to the new memory. If the element's move constructor is `noexcept`, `vector` will use moving to transfer them, which is very fast. If the move constructor is not `noexcept`, `vector` will fall back to copying (provided the element type is copyable; a move-only type is still moved). + +This is because `vector` needs to provide a strong exception safety guarantee: if an exception is thrown during reallocation, `vector`'s state must be rolled back to before the reallocation. If moving is used, once an exception is thrown midway, the already-moved elements cannot be restored (their resources have already been stolen). If copying is used, the original data is still there, and a safe rollback is possible. 
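The choice between moving and copying can be inspected at compile time. The following is a minimal sketch assuming a C++17 or later compiler: `ThrowingMove` and `NothrowMove` are hypothetical types introduced here, and `std::move_if_noexcept` is the standard utility that expresses the same "move only if it cannot throw, otherwise copy" decision. How a particular `vector` implementation structures its reallocation internally is not shown and may differ.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type whose move constructor is NOT declared noexcept.
struct ThrowingMove {
    ThrowingMove() = default;
    ThrowingMove(const ThrowingMove&) {}
    ThrowingMove(ThrowingMove&&) {}            // may throw, as far as the compiler knows
};

// Hypothetical type whose move constructor IS declared noexcept.
struct NothrowMove {
    NothrowMove() = default;
    NothrowMove(const NothrowMove&) {}
    NothrowMove(NothrowMove&&) noexcept {}
};

// The trait that noexcept-sensitive code queries.
static_assert(!std::is_nothrow_move_constructible_v<ThrowingMove>,
              "a move constructor without noexcept is treated as potentially throwing");
static_assert(std::is_nothrow_move_constructible_v<NothrowMove>,
              "a noexcept move constructor is reported as non-throwing");

// std::move_if_noexcept yields const T& (forcing a copy) when the move may throw
// and the type is copyable; otherwise it yields T&& (allowing a move).
static_assert(std::is_same_v<
                  decltype(std::move_if_noexcept(std::declval<ThrowingMove&>())),
                  const ThrowingMove&>,
              "potentially throwing move: fall back to copy");
static_assert(std::is_same_v<
                  decltype(std::move_if_noexcept(std::declval<NothrowMove&>())),
                  NothrowMove&&>,
              "noexcept move: move is used");
```

A `static_assert(std::is_nothrow_move_constructible_v<YourType>)` placed next to a class definition is a cheap way to catch a forgotten `noexcept` before it shows up as a performance regression.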
+ +Let's write a simple test to verify this behavior: + +```cpp +#include <cstring> +#include <iostream> +#include <vector> + +class StringNoNoexcept +{ + std::size_t len_; + char* str_; + +public: + StringNoNoexcept(const char* s) + : len_(std::strlen(s)) + , str_(new char[len_ + 1]) + { + std::memcpy(str_, s, len_ + 1); + std::cout << " ctor: " << str_ << "\n"; + } + + ~StringNoNoexcept() + { + delete[] str_; + } + + StringNoNoexcept(const StringNoNoexcept& o) + : len_(o.len_) + , str_(new char[o.len_ + 1]) + { + std::memcpy(str_, o.str_, len_ + 1); + std::cout << " COPY ctor: " << str_ << "\n"; + } + + // No noexcept! + StringNoNoexcept(StringNoNoexcept&& o) + : len_(o.len_) + , str_(o.str_) + { + o.str_ = nullptr; + o.len_ = 0; + std::cout << " MOVE ctor: " << (str_ ? str_ : "(null)") << "\n"; + } + + const char* c_str() const { return str_ ? str_ : "(null)"; } +}; + +int main() +{ + std::vector<StringNoNoexcept> vec; + vec.reserve(2); + + std::cout << "=== push 3 elements (triggers reallocation) ===\n"; + vec.emplace_back("AAA"); + vec.emplace_back("BBB"); + vec.emplace_back("CCC"); // this one triggers reallocation + + std::cout << "\n=== final contents ===\n"; + for (const auto& s : vec) { + std::cout << " " << s.c_str() << "\n"; + } + return 0; +} +``` + +After compiling and running, you'll see output like this (GCC 16.1.1, `-std=c++20 -O2`): + +```bash +$ g++ -std=c++20 -O2 test_noexcept.cpp && ./a.out +=== push 3 elements (triggers reallocation) === + ctor: AAA + ctor: BBB + ctor: CCC + COPY ctor: AAA # reallocation uses copy, not move! + COPY ctor: BBB +``` + +See that? When the third element triggers reallocation, `vector` **copies** the first two elements to the new memory, even though we clearly implemented a move constructor. The reason is that our move constructor isn't marked `noexcept`. 
+ +Now let's add `noexcept` to the move constructor: + +```cpp +StringNoNoexcept(StringNoNoexcept&& o) noexcept // noexcept added +``` + +Recompile and run: + +```bash +$ g++ -std=c++20 -O2 test_noexcept.cpp && ./a.out +=== push 3 elements (triggers reallocation) === + ctor: AAA + ctor: BBB + ctor: CCC + MOVE ctor: AAA # now it moves! + MOVE ctor: BBB +``` + +The difference of a single `noexcept` keyword directly determines whether `vector` uses copy or move during reallocation. For a class that holds dynamic memory, in scenarios with large amounts of data, this difference can mean an order-of-magnitude performance gap. + +This is a genuine production-level trap. Many people write move constructors but forget to add `noexcept`, and then are puzzled in performance testing about "why move semantics aren't taking effect." The answer often lies in that one keyword. + +## The Complete MyString: The Rule of Five Assembled + +Combining the content of this article with the previous two, we get a complete, Rule of Five-compliant `MyString` implementation: + +```cpp +#include <cstddef> +#include <cstring> + +class MyString +{ + std::size_t stored_length_; + char* actual_str_; + +public: + // Constructor + explicit MyString(const char* s = "") + : stored_length_(std::strlen(s)) + , actual_str_(new char[stored_length_ + 1]) + { + std::memcpy(actual_str_, s, stored_length_ + 1); + } + + // Destructor + ~MyString() + { + delete[] actual_str_; + } + + // Copy constructor + MyString(const MyString& other) + : stored_length_(other.stored_length_) + , actual_str_(new char[other.stored_length_ + 1]) + { + std::memcpy(actual_str_, other.actual_str_, stored_length_ + 1); + } + + // Move constructor: noexcept! 
+ MyString(MyString&& s) noexcept + : stored_length_(s.stored_length_) + , actual_str_(s.actual_str_) + { + s.actual_str_ = nullptr; + s.stored_length_ = 0; + } + + // Copy assignment operator + MyString& operator=(const MyString& other) + { + if (this != &other) { + delete[] actual_str_; + stored_length_ = other.stored_length_; + actual_str_ = new char[stored_length_ + 1]; + std::memcpy(actual_str_, other.actual_str_, stored_length_ + 1); + } + return *this; + } + + // Move assignment operator: noexcept! + MyString& operator=(MyString&& s) noexcept + { + if (this != &s) { + delete[] actual_str_; + stored_length_ = s.stored_length_; + actual_str_ = s.actual_str_; + s.actual_str_ = nullptr; + s.stored_length_ = 0; + } + return *this; + } + + const char* c_str() const { return actual_str_ ? actual_str_ : "(null)"; } + std::size_t size() const { return stored_length_; } +}; +``` + +All five special member functions (destructor, copy constructor, copy assignment, move constructor, and move assignment) are present and accounted for. This is the so-called Rule of Five: if you need to customize any one of them, you most likely need to customize all five. The compiler-generated default versions are unsafe for classes that own resources through raw pointers. + +## What We've Cleared Up + +Across three articles, we started from the three deep copies of `swap`, went through the value category system of lvalues and rvalues, and finally in this article broke down all the implementation details of move operations. Let me use a concise checklist to review the core points of this article. + +The core of a move constructor is "destructive copy": steal the source object's resource pointer, then set the source object to a harmless state. Overload resolution automatically selects between copy and move; you don't need to make extra judgments at the call site. `std::move` doesn't move anything; it's simply a cast to an rvalue reference that enables overload resolution to select the move version. 
An rvalue reference parameter is an lvalue inside a function — because it has a name — so you still need `std::move` to move from it. The `return` statement is an exception to the "if it has a name, it's an lvalue" rule; the compiler automatically identifies implicitly movable return expressions. NRVO can deliver return values to the caller at zero cost — and `return std::move(temp)` prevents NRVO, so never write it that way. A moved-from object is in a "valid but unspecified" state; the only safe operations are assigning a new value or destructing it. Move constructors must be marked `noexcept` — otherwise `std::vector` will fall back to copying during reallocation, and the performance gap can be enormous. + +If you want to dive deeper into more application scenarios of move semantics — perfect forwarding, universal references, reference collapsing — check out vol2's [Perfect Forwarding: Preserving Exact Value Category Propagation](../../../vol2-modern-features/ch00-move-semantics/04-perfect-forwarding.md). Move semantics combined with perfect forwarding form the complete foundation of modern C++ template programming. 
+ + + + + + + + + + diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/index.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/index.md new file mode 100644 index 000000000..809424978 --- /dev/null +++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/index.md @@ -0,0 +1,43 @@ +--- +title: 'Back to Basics: Move Semantics' +description: 'CppCon 2025 Talk Notes — Ben Saks: A Gentle Introduction to C++ Move + Semantics' +conference: cppcon +conference_year: 2025 +talk_title: 'Back to Basics: Move Semantics' +speaker: Ben Saks +video_bilibili: https://www.bilibili.com/video/BV1X54y1P7uM +video_youtube: https://www.youtube.com/watch?v=szU5b972F7E +tags: +- cpp-modern +- host +- beginner +difficulty: beginner +platform: host +cpp_standard: +- 11 +- 17 +- 20 +translation: + source: documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/index.md + source_hash: 664132f48c63ae9c48a0d98bbf5fa6f11ec9e73f36d8d2deebe14c224899a385 + translated_at: '2026-06-13T02:18:23.227621+00:00' + engine: anthropic + token_count: 251 +--- + + +## Notes + + + The Cost of Copying and the Motivation for Moving: From swap to MyString + Lvalues, Rvalues, and References: The Type System Foundation of Move Semantics + Move Operations, std::move, and Copy Elision + diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/index.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/index.md index 8dc89645c..e0c1ea785 100644 --- a/documents/en/vol10-open-lecture-notes/cppcon/2025/index.md +++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/index.md @@ -43,3 +43,32 @@ A collection of notes from CppCon 2025 talks. 
C++: Exploring the Underlying Assembly + +--- + + + + + Back to Basics: C++ Ranges + + +--- + + + + + Back to Basics: Move Semantics + diff --git a/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md b/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md new file mode 100644 index 000000000..01b284fe2 --- /dev/null +++ b/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md @@ -0,0 +1,409 @@ +--- +title: "从循环到迭代器:遍历数据的抽象之路" +description: "CppCon 2025 演讲笔记 —— Mike Shah:从 for 循环、指针遍历到迭代器抽象,补全迭代器类别体系并用 GCC 16.1.1 实测 legacy tag 与 C++20 concept 的差异" +conference: cppcon +conference_year: 2025 +talk_title: "Back to Basics: C++ Ranges" +speaker: "Mike Shah" +video_youtube: "https://www.youtube.com/watch?v=Q434UHWRzI0" +tags: + - cpp-modern + - host + - beginner + - Ranges + - 容器 +difficulty: beginner +platform: host +cpp_standard: [11, 17, 20] +chapter: 3 +order: 1 +--- + +# 从循环到迭代器:遍历数据的抽象之路 + +:::tip +本文基于 CppCon 2025 Mike Shah 的 "Back to Basics: C++ Ranges" 做深度二创。上面是 YouTube 链接。本系列计划拆成三篇:本篇讲清楚「遍历数据」这条线(循环 → 指针 → 迭代器 → range-based for),第二篇讲 STL 算法与迭代器陷阱,第三篇才正式进入 Ranges、Views 与管道组合。实验环境为 Arch Linux WSL,GCC 16.1.1,编译选项 `-std=c++20`。 +::: + +Mike Shah 在演讲开场抛了一句很朴素、但我越想越觉得有道理的话:**算法(algorithm)本质上就是循环**。他说自己读研究生时读到一篇 2012 年做算法性能经验评估的论文,得到的启发是——面对一个陌生代码库,想搞清楚「计算到底发生在哪里」,最快的办法就是去找程序里的循环。因为咱们作为工程师,一半的工作是**转换数据**,另一半是**存储数据**,而循环就是「转换数据」这件事最直接的载体。 + +:::warning 这里要给 Shah 老师的话打个折 +「算法 = 循环」是他自己反复强调的"极度简化"(a gross oversimplification),咱们听个意思就行。严格来说,算法是求解问题的有限步骤序列——递归算法、并行算法(``)、协程式的算法,都不一定长着 `for` 的样子。循环只是最常见的载体之一。但作为理解 STL 和 Ranges 的切入点,这个简化是好用的:**先把循环看懂,再看 STL 怎么把循环抽象掉。** +::: + +这一篇我们就从最原始的下标循环开始,一步步看 C++ 是怎么把「遍历数据」这件事一层层抽象出来的。我们的终点不是 Ranges(那是第三篇),而是**迭代器**——它是连接「循环」和「算法」的那座桥。 + +先把实验环境摆出来,后面所有输出都基于它: + +```bash +❯ g++ --version +g++ (GCC) 16.1.1 20260430 + +❯ uname -sr +Linux 6.18.33.1-microsoft-standard-WSL2 +``` + +## 最朴素的遍历:下标 for 
循环 + +一切从这里开始。假设我们有一串字符要逐个打印,绝大多数人下意识写出来的就是三段式的 `for`: + +```cpp +#include +#include + +int main() +{ + std::array message{'H', 'e', 'l', 'l', 'o'}; + + for (std::size_t i = 0; i < message.size(); ++i) { + std::cout << message[i]; + } + std::cout << '\n'; +} +``` + +这段代码里其实藏着两个隐含的假设,只是我们用得太顺手、不会去想。第一,它假设容器支持 `operator[]` 下标访问;第二,它假设容器自己知道自己的 `size()`。`std::array`、`std::vector`、`std::string` 都满足这两条,所以跑起来没问题。但只要换成 `std::list` 或 `std::set`——它们没有下标访问——这段代码就编不过了。同一份「遍历」的逻辑,换个容器就得重写,这正是抽象不够的信号。 + +不过先别急着抽象,下标循环该不该用、什么时候用,是个有讲究的问题,但不是这里的重点。我们关心的是:**它表达了「遍历」这件事,但它把遍历和「容器恰好是连续存储、恰好支持下标」这两件事绑死了。** 我们想把前者单独拎出来。 + +## 换个视角:用指针遍历 + +Shah 在幻灯片上换了一种写法,当时我愣了一下——这居然也行?他不用下标,而是拿到数组的首地址,然后用指针去走: + +```cpp +char* begin = message.data(); +char* end = message.data() + message.size(); +for (char* p = begin; p != end; ++p) { + std::cout << *p; +} +``` + +这里的 `data()` 返回底层数组的首地址,`end` 就是首地址加上元素个数——指针加法。然后循环体里 `*p` 解引用、`++p` 前进一步。运行结果和下标版本一模一样,但视角完全不同了:**我们不再依赖「下标」这个抽象,而是直接操作「地址」。** + +为什么要换这个视角?Shah 的动机很直接——**泛化**。下标假设了「连续存储 + 随机访问」,但现实里很多数据结构不是连续的:链表、树、图。一棵二叉树你怎么 `tree[i]`?你没法用一个整数去索引它。但「从某个起点出发,一步步走到下一个元素」这件事,是所有数据结构遍历的共同内核。指针 `++` 只是最简单的一种「走到下一个」的实现。 + +:::tip 顺带说一句 STL 的来历 +把「递增指针」这件事抽象成一个可替换的对象,是 90 年代 Alexander Stepanov 和 Meng Lee 在惠普(HP)实验室完成的工作——这就是 STL 的原型,1993–94 年提交给委员会,后来并入 C++98 标准。迭代器从一开始就是为了「把算法和数据结构解耦」而生的,不是后来拍脑袋加的。 +::: + +## 迭代器:指针的泛化 + +既然「走到下一个元素」可以有不同的实现,那我们干脆把它抽象成一个类型——这就是**迭代器(iterator)**。cppreference 上对迭代器的第一句话就是:**「迭代器是指针的泛化(a generalization of pointers)」**。 + +我们用 `std::begin` 和 `std::end` 这对自由函数来获取容器首尾的迭代器: + +```cpp +for (auto it = std::begin(message); it != std::end(message); ++it) { + std::cout << *it; +} +``` + +你看,和指针版本的写法几乎一模一样——`begin`、`end`、`!=`、`++`、`*`。区别只在于 `it` 的类型不再是 `char*`,而是一个「表现得像指针」的对象。换成 `std::list`、`std::set`,这段代码一个字都不用改就能跑(只要它们的迭代器支持这些操作)。抽象在这里开始回报我们了。 + +这里有两个细节值得停一下。第一个是 `begin()` 指向第一个元素,而 `end()` 指向**最后一个元素的下一个位置**(one-past-the-end),它本身不可解引用。这个半开区间 `[begin, end)` 的约定不是随便定的:**它让「空容器」的判断变得极其自然**——空容器就是 `begin == end`,循环条件直接为假,根本不用特判。如果 
`end` 指向最后一个元素本身,那空容器就没有「最后一个元素」,处理起来就别扭了。 + +第二个细节是 `std::begin` / `std::end` 这种**自由函数**形式,和容器的 `.begin()` / `.end()` **成员函数**形式的区别。 + +:::warning Shah 这里说得不够准 +Shah 在演讲里说「只有部分容器拥有 `.begin()`、`.end()`,但并非所有容器都有,所以自由函数更通用」——这个说法其实**不准确**。事实是:**所有 STL 容器都有 `.begin()` / `.end()` 成员函数**,没有例外。 + +自由函数 `std::begin` / `std::end` 真正的价值在三件事上:一是对**原生数组**(比如 `int arr[5]`)做了重载——数组没有成员函数,只能靠自由函数拿到首尾指针;二是让**泛型代码**写起来更统一(模板里不用区分「这是容器还是数组」);三是 C++20 的 `std::ranges::begin` 还能处理哨兵(sentinel)和代理类型(比如 `vector`)。所以更准确的说法是:**自由函数对内置数组和自定义类型更统一,而不是「有些容器没有成员函数」。** +::: + +## 迭代器类别体系:不是所有迭代器都一样能干 + +到这一步,Shah 在演讲里直接说了一句「迭代器的类别我就不展开讲了」,然后跳过去了。但这恰恰是新手最容易栽跟头的地方,我们这篇既然是二创,就把它补齐——这也是本篇的**重头戏**。 + +不是所有迭代器能力都一样。`std::vector` 的迭代器能 `it + 5` 一下跳五格,但 `std::list` 的迭代器不行,它只能 `++` 一步步走。标准把迭代器按能力分成了几个**类别(category)**,从弱到强大致是:输入(input)→ 前向(forward)→ 双向(bidirectional)→ 随机访问(random access)→ 连续(contiguous,C++20 新增)。 + +关键问题是:**你怎么知道某个迭代器属于哪个类别?** C++20 之前,靠的是一个叫 `std::iterator_traits::iterator_category` 的类型特征(一个 tag 类型);C++20 之后,改成了一组**概念(concepts)**,比如 `std::random_access_iterator`、`std::contiguous_iterator`。这两套东西在 C++20 里并存,但它们对同一个迭代器可能给出**不同**的答案——这背后藏着一个非常重要的演进。 + +我写了个小程序,用 GCC 16.1.1 把常见容器的两套结果都打出来: + +```cpp +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +// 旧的 C++98 风格:从 iterator_traits 取 tag +template +const char* legacy_tag() +{ + using cat = typename std::iterator_traits::iterator_category; + if constexpr (std::is_same_v) return "contiguous"; + else if constexpr (std::is_same_v) return "random_access"; + else if constexpr (std::is_same_v) return "bidirectional"; + else if constexpr (std::is_same_v) return "forward"; + else if constexpr (std::is_same_v) return "input"; + else return "?"; +} + +// 新的 C++20 风格:用 concept 探测 +template +const char* cpp20_concept() +{ + if constexpr (std::contiguous_iterator) return "contiguous_iterator"; + else if constexpr (std::random_access_iterator) return "random_access_iterator"; + else if 
constexpr (std::bidirectional_iterator) return "bidirectional_iterator"; + else if constexpr (std::forward_iterator) return "forward_iterator"; + else if constexpr (std::input_iterator) return "input_iterator"; + else return "(none)"; +} + +template +void row(const char* name) +{ + std::printf("%-26s legacy_category=%-15s cpp20_concept=%s\n", + name, legacy_tag(), cpp20_concept()); +} + +int main() +{ + row::iterator>("std::array"); + row::iterator>("std::vector"); + row("std::string"); + row::iterator>("std::deque"); + row::iterator>("std::list"); + row::iterator>("std::forward_list"); + row::iterator>("std::set"); + row::iterator>("std::map"); + row("int* (raw pointer)"); + + static_assert(std::contiguous_iterator); + static_assert(std::random_access_iterator::iterator>); + static_assert(!std::contiguous_iterator::iterator>); + static_assert(!std::random_access_iterator::iterator>); + std::printf("static_assert checks: PASS\n"); +} +``` + +编译运行: + +```bash +❯ g++ -std=c++20 -O2 -Wall iter.cpp -o iter && ./iter +std::array legacy_category=random_access cpp20_concept=contiguous_iterator +std::vector legacy_category=random_access cpp20_concept=contiguous_iterator +std::string legacy_category=random_access cpp20_concept=contiguous_iterator +std::deque legacy_category=random_access cpp20_concept=random_access_iterator +std::list legacy_category=bidirectional cpp20_concept=bidirectional_iterator +std::forward_list legacy_category=forward cpp20_concept=forward_iterator +std::set legacy_category=bidirectional cpp20_concept=bidirectional_iterator +std::map legacy_category=bidirectional cpp20_concept=bidirectional_iterator +int* (raw pointer) legacy_category=random_access cpp20_concept=contiguous_iterator +static_assert checks: PASS +``` + +看出门道了吗?**最有意思的就是前几行和最后一行**。`std::array`、`std::vector`、`std::string`,还有裸指针 `int*`——它们的旧 tag 全是 `random_access`,但 C++20 concept 探测出来却是 `contiguous_iterator`。 + +这就是问题所在:**旧的 tag 体系里,根本就没有 `contiguous`(连续)这个档位**(`contiguous_iterator_tag` 是 
C++20 才加进去的)。在 C++20 之前,`int*` 的 `iterator_category` 只能被标成 `random_access`,没法表达「这块内存不仅是随机可访问的、而且是在物理上连续存储的」这个更强的性质。这个区分为什么重要?因为「连续存储」意味着你可以安全地把迭代器 underlying 的数据当成一块连续内存喂给 C 接口(比如 `memcpy`、CUDA kernel、或者 SIMD 指令)——而 `std::deque` 虽然也支持 `it + 5`,但它内部是一段一段的分块存储,**不连续**,所以它的 concept 是 `random_access_iterator` 而不是 `contiguous`。 + +:::tip 这就是 concepts 比 tag 强的地方 +旧 tag 是个继承链(`random_access_iterator_tag` 继承自 `bidirectional_iterator_tag` 继承自……),表达能力有限,只能分层。C++20 的 concept 是一组**正交的、可组合的约束**,能精确说出「随机可访问」和「连续存储」是两件可以独立成立的事。这也是为什么 Ranges 整套体系必须等 C++20 的 concepts 落地才能进标准——没有 concept,很多约束根本表达不出来。关于 concepts 更系统的讲解,可以看 vol4 的相关文章,我们第三篇讲 Ranges 时也会用到。 +::: + +## 迭代器算术与 std::advance + +有了类别概念,再看迭代器的算术操作就清楚了。对随机访问迭代器,你可以直接 `it + 5`、`it - 2`、`it1 - it2`(求距离),这些都是 O(1)。但对双向或前向迭代器,`it + 5` 直接编不过——它们只认 `++` 和 `--`。 + +那如果我写的是泛型代码,想「往前走 n 步」但又不限定迭代器类别怎么办?标准库给了 `std::advance`: + +```cpp +auto it = std::begin(message); +auto last = std::end(message); +std::ptrdiff_t available = std::distance(it, last); +if (5 < available) { + std::advance(it, 5); // 安全:确认走得到 +} +``` + +`std::advance` 的妙处在于它会根据迭代器类别**自动选实现**:传给它 `vector::iterator`,它走的是 `it + n`(O(1));传给它 `list::iterator`,它退化成 n 次 `++`(O(n))。同一个调用接口,背后是不同的算法复杂度——这就是泛型编程的甜头。 + +:::warning advance 不做边界检查 +但有一点必须提醒:**`std::advance` 自己不检查边界**。如果你让它往前走 100 步,而容器里只有 5 个元素,它不会报错,而是直接越界——解引用就是段错误(UB)。所以上面那段代码我才先用 `std::distance` 算了剩余长度、做了判断。实战里如果想要带边界检查的迭代器,GCC/Clang 可以加 `-D_GLIBCXX_DEBUG` 编译宏,让标准库的迭代器在调试模式下带上下界检测——下一篇我们会用它抓一个真实的越界 bug。MSVC 那边对应的是 `_ITERATOR_DEBUG_LEVEL=2`。 +::: + +## range-based for:循环的语法糖 + +讲了半天迭代器,回到日常写代码——我们绝大多数时候并不会手写 `for (auto it = begin; it != end; ++it)`,而是用 C++11 给的**范围 for 循环(range-based for)**: + +```cpp +for (char c : message) { + std::cout << c; +} +``` + +干净、不容易写错、不用操心 `end`。但这个语法糖背后到底是什么?其实它就是上面手写迭代器循环的等价改写。按标准规定,它大致等价于: + +```cpp +{ + auto&& __range = message; + auto __begin = std::begin(__range); // 或 __range.begin() + auto __end = std::end(__range); // 或 __range.end() + for (; __begin != __end; ++__begin) { + 
char c = *__begin; + std::cout << c; // 你的循环体 + } +} +``` + +这就解释了一个常见的疑惑:**range-based for 是怎么知道去调 `begin`/`end` 的?** 答案是编译器在背后帮你插了这两句。它先拿 `__range`,再取首尾迭代器,然后就是普通迭代器循环。所以 range-based for 对迭代器类别没有任何额外要求——只要你的类型能提供 `begin`/`end`(成员或自由函数都行),它就能用。这也是为什么后面我们自定义类型只要实现这两个函数,就能直接塞进 range-based for。 + +如果遍历的是 `std::map` 这种键值对容器,C++17 的**结构化绑定(structured binding)**配合 range-based for 会非常顺手: + +```cpp +const std::map scores{ + {"alice", 90}, {"bob", 85} +}; + +for (const auto& [name, score] : scores) { + std::cout << name << ": " << score << '\n'; +} +``` + +:::warning 给结构化绑定补个版本号 +Shah 在演讲里用到了结构化绑定,但**没标它是哪个标准的特性**——这里补一下:**结构化绑定是 C++17(提案 P0217)引入的**。如果你的工程还在 C++14,这段代码编不过。 + +另外 Shah 提到一句「省略号语法可以进一步解包」,这个表述其实有点含糊。结构化绑定本身并不支持变长解包(它绑定的元素个数是固定的,得和右侧类型的成员数对上);省略号在 C++ 里属于模板参数包展开(pack expansion)和折叠表达式(fold expression)的语境,和结构化绑定不是一回事。建议把这句当成口误,别深究。 +::: + +## 实验:range-based for 和手写循环,编译出来一样吗 + +每次跟人讲「range-based for 只是语法糖」,总会有人将信将疑——那几个 `__range`、`__begin`、`__end` 临时变量,会不会拖慢性能?我们来实测。我把同一个「求和」用四种写法写出来: + +```cpp +#include + +int sum_index(const std::vector& v) +{ + int s = 0; + for (std::size_t i = 0; i < v.size(); ++i) s += v[i]; + return s; +} + +int sum_ptr(const std::vector& v) +{ + int s = 0; + for (const int* p = v.data(), *e = p + v.size(); p != e; ++p) s += *p; + return s; +} + +int sum_iter(const std::vector& v) +{ + int s = 0; + for (auto it = v.begin(), e = v.end(); it != e; ++it) s += *it; + return s; +} + +int sum_rangefor(const std::vector& v) +{ + int s = 0; + for (int x : v) s += x; + return s; +} +``` + +然后开 `-O2` 让编译器生成汇编: + +```bash +❯ g++ -std=c++20 -O2 -S codegen.cpp -o codegen.s +``` + +去 `.s` 文件里翻这四个函数的热循环,你会发现它们清一色长成这样(以 `sum_rangefor` 为例): + +```asm +.L19: + addl (%rax), %edx ; s += *p + addq $4, %rax ; p++ (int 占 4 字节) + cmpq %rcx, %rax ; p == e ? 
+ jne .L19 ; 不等就继续 +``` + +四种写法生成的循环体**字节级几乎一致**——编译器在 `-O2` 下把那些临时变量、下标计算、指针算术全都归约成了同一段 `add / cmp / jne`。也就是说,**range-based for 在优化开启后没有任何额外开销**,你可以放心地为了可读性用它。代价只有在 `-O0`(不优化)时才显现:那几个 `__begin`/`__end` 临时量会老老实实存在栈上,但谁会在 `-O0` 下追求性能呢。 + +:::tip 一个 C++17 才放开的限制 +顺带提一句 range-based for 本身的历史:它是 C++11(提案 N2930)进的标准。C++11 那版的展开规则有个限制:它用一条声明 `auto __begin = begin-expr, __end = end-expr;` 同时定义首尾,这就要求 `begin` 和 `end` 返回的类型完全相同(`__end` 从 C++11 起就只在循环开始时求值一次,并不会每轮重算)。C++17(提案 P0184)把这条声明拆成了两条独立的声明,允许 `__end` 和 `__begin` 类型不同,这正是后来 C++20 的哨兵(sentinel)能直接用在 range-based for 里的前提。所以你今天用的 range-based for,是 C++17 修订过的版本,适用面更广。这也提醒我们:能用新标准就尽量用新标准,很多「语法糖」在后续版本里被悄悄打磨过。 +::: + +## 一对迭代器,就是一个 range + +到这里我们可以给「遍历」画一条完整的线了:**一个起点迭代器 `begin`,加一个终点标记 `end`,中间用 `++` 一步步走**——这一对迭代器,就定义了一段可遍历的数据。标准库管这种「一对迭代器」叫一个 **range**。 + +这个概念重要在哪里?因为它把「数据在哪里」和「怎么处理数据」彻底解耦了。我写一个求和函数,只要它能接收一对迭代器,那它对 `vector`、`list`、`set`、甚至你自己手写的链表,统统适用——只要这些容器能提供符合要求的迭代器。算法不再绑死在某种具体容器上。 + +而迭代器这个抽象本身,其实是经典的设计模式——**迭代器模式(Iterator pattern)**,属于 GoF《Design Patterns》里的行为型模式。它的核心思想就是「提供一种方法,顺序访问一个聚合对象中的元素,而又不暴露该对象的内部表示」。C++ 把它做成了语言级的设施(`begin`/`end`/`operator++`/`operator*` 的约定),让任何类型只要遵守这个约定,就能接入整套 STL 算法生态。 + +这个「一对迭代器即 range」的定义,正是第三篇我们要讲的 `std::ranges::range` concept 的前身。区别在于,C++20 的 range 概念允许 `end` 返回一个**和 `begin` 不同类型**的哨兵(sentinel)——这会解锁一些很有意思的能力(比如遍历以 `'\0'` 结尾的 C 字符串时,不用先算长度)。这个我们留到第三篇展开。 + +## 到这里我们搞清楚了什么 + +我们从最原始的下标 `for` 出发,看到了「遍历」这件事如何一步步被抽象:下标循环把遍历和「连续存储 + 随机访问」绑死;指针遍历把它解放到「地址」层面;迭代器把它进一步抽象成「一个能 `++`、能 `*` 的对象」,从此算法和数据结构解耦。我们还补全了 Shah 跳过的迭代器类别体系,并用 GCC 16.1.1 实测了一个关键事实:**旧 tag 把 `vector`/`string`/裸指针都笼统标成 `random_access`,而 C++20 的 concept 能精确说出它们其实是更强的 `contiguous_iterator`**——这正是 concepts 比 tag 强、也是 Ranges 必须等 C++20 才能落地的原因。 + +核心就一句话:**一对迭代器(一个 `begin`、一个 `end`)定义了一个 range,而 STL 算法就建立在这对迭代器之上。** + +下一篇我们就把这对迭代器交给 STL 算法——看 `std::sort`、`std::partition`、`std::transform` 这些「循环的替代品」怎么用,以及它们对迭代器类别有什么硬性要求(比如 `std::sort` 为什么不能用在 `std::list` 上)。那里还有几个迭代器的经典陷阱等着我们:迭代器失效、配错 `begin`/`end`、参数顺序写反。如果你想先复习一下容器本身的内存布局,vol3 的 [span:不拥有数据的视图](../../../vol3-standard-library/02-span.md) 和容器相关文章是很好的前置阅读。 + + + + + + + + + + +diff --git 
a/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/02-stl-algorithms-and-iterator-pitfalls.md b/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/02-stl-algorithms-and-iterator-pitfalls.md new file mode 100644 index 000000000..cb7322571 --- /dev/null +++ b/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/02-stl-algorithms-and-iterator-pitfalls.md @@ -0,0 +1,408 @@ +--- +title: "STL 算法实战与迭代器陷阱" +description: "CppCon 2025 演讲笔记 —— Mike Shah:STL 算法族实战、迭代器类别硬约束,补算法速查表与失效规则表,用 GCC 实测迭代器失效的静默 UB 与 _GLIBCXX_DEBUG 捕获" +conference: cppcon +conference_year: 2025 +talk_title: "Back to Basics: C++ Ranges" +speaker: "Mike Shah" +video_youtube: "https://www.youtube.com/watch?v=Q434UHWRzI0" +tags: + - cpp-modern + - host + - beginner + - Ranges + - 容器 +difficulty: beginner +platform: host +cpp_standard: [11, 17, 20] +chapter: 3 +order: 2 +--- + +# STL 算法实战与迭代器陷阱 + +:::tip +这是 CppCon 2025 Mike Shah "Back to Basics: C++ Ranges" 系列的第二篇。上一篇我们把「遍历」从下标循环一路抽象到了迭代器,结论是:**一对 `begin`/`end` 迭代器定义了一个 range**。这一篇我们就把这对迭代器喂给 STL 算法——看它们怎么替你写循环,以及它们对迭代器有哪些硬性要求。本篇还会重点拆几个迭代器的经典陷阱,并全部用 GCC 16.1.1 实测给你看。环境同上:Arch Linux WSL,`-std=c++20`。 +::: + +上一篇结尾我们说,算法就建立在那对迭代器之上。这话要落到具体上,得先搞清楚 STL 到底由哪几块拼起来的。 + +## STL 的三大支柱 + +标准模板库(STL)的设计哲学,是把三样东西解耦开:**容器(containers)**负责存数据,**迭代器(iterators)**负责遍历数据,**算法(algorithms)**负责处理数据。三者通过迭代器这个「胶水」连接起来——算法不直接认识任何具体容器,它只认迭代器;容器只要能吐出符合要求的迭代器,就能被所有算法复用。这个解耦是 STL 能用一个 `std::sort` 通吃 `vector`、`array`、`deque` 的根本原因。 + +那算法到底在哪几个头文件里? 
+ +:::warning Shah 说的「两个头」有点窄 +Shah 在演讲里说「算法主要在 `` 和 `` 两个头文件」——这对入门理解没问题,但实际上**漏了好几块**。完整的图景是这样的:通用算法(`sort`、`find`、`copy`、`transform` 等)在 ``;数值算法(`accumulate`、`reduce`、`inner_product` 等)在 ``;**并行算法**(带执行策略的 `sort(std::execution::par, ...)` 等)需要 ``(C++17);C++20 的 ranges 算法和 views 在 ``;甚至还有分散的——`std::midpoint` 在 ``,但 C++23 的折叠算法 `std::fold_left` 却在 ``。所以别死记「算法=两个头」,记成「算法分散在几个头里,`` 是主力」更准确。 +::: + +## 算法速查表:按类别看,每个算法要什么迭代器 + +STL 算法有上百个,硬背没意义。更好记的方式是**按类别归类**,并且记住**每个类别对迭代器类别的硬性要求**——因为这直接决定了你能不能把它用在某个容器上。下面这张表是本篇的二创重点,Shah 在演讲里没展开: + +| 分类 | 代表算法 | 所需迭代器类别 | +|------|------|------| +| 只读查找 | `find` / `find_if` / `count` / `accumulate` | input(最弱即可) | +| 修改拷贝 | `copy` / `transform` / `replace` / `fill` | forward / output | +| 分区 | `partition` / `stable_partition` | forward(稳定版需 bidirectional) | +| 排序 | `sort` / `stable_sort` / `partial_sort` | **random_access**(硬要求) | +| 二分查找 | `lower_bound` / `upper_bound` / `binary_search` | forward(**且区间必须已有序**) | +| 数值归约 | `reduce` / `transform_reduce` / `inner_product` | input | +| 堆操作 | `push_heap` / `pop_heap` / `sort_heap` | random_access | + +这里最值得记住的一条是:**排序类算法要求随机访问迭代器(random access)**。这意味着它们只能用在 `vector`、`array`、`deque` 这种连续或随机访问的容器上,**用在 `std::list` 上直接编不过**。这不是建议,是硬约束。我们来实测一下。 + +## 实验:std::sort 不能用在 std::list 上 + +`std::list` 是双向迭代器(bidirectional),不支持 `it + n`、也不支持两个迭代器相减。而 `std::sort` 内部需要随机访问(它要做 `__last - __first` 来估算递归深度)。把 list 的迭代器塞进去会怎样? + +```cpp +#include +#include + +int main() +{ + std::list l{3, 1, 2}; + std::sort(l.begin(), l.end()); // 编不过! 
+} +``` + +GCC 16.1.1 的报错(截取关键几行): + +```bash +❯ g++ -std=c++20 list_sort.cpp -o list_sort +/usr/include/c++/16.1.1/bits/stl_algo.h:1914:50: error: no match for ‘operator-’ + (operand types are ‘std::_List_iterator’ and ‘std::_List_iterator’) + 1914 | std::__lg(__last - __first) * 2, + | ~~~~~~~^~~~~~~~~ +``` + +看到没——错误就出在 `__last - __first` 这一步:`std::sort` 想用迭代器减法算区间长度,但 `_List_iterator` 根本没定义 `operator-`(双向迭代器只认 `++`/`--`,不认减法)。这就是「迭代器类别不满足算法要求」的典型表现。如果你确实要排序一个 `list`,用它的成员函数 `l.sort()`——那是为链表量身定制的归并排序,复杂度还是 O(n log n),但不依赖随机访问。 + +## sort、partition、copy、transform:常见算法长什么样 + +我们快速过一遍最常用的几个算法,建立直觉。它们的参数形态惊人地统一——绝大多数都是**一对迭代器 `(first, last)` 加上一个可选的谓词或目标**。 + +```cpp +#include +#include +#include +#include + +void demo(std::vector& v, const std::vector& src) +{ + // 排序整个区间 + std::sort(v.begin(), v.end()); + + // 局部排序:只排 [begin, begin+3),后面元素顺序不定但都 >= 前 3 个 + // std::partial_sort(v.begin(), v.begin() + 3, v.end()); + + // 分区:把满足谓词的元素挪到前面,返回分界点 + auto it = std::partition(v.begin(), v.end(), [](int x) { return x < 4; }); + + // 拷贝:用 back_inserter 自动 push_back,不用预先算大小 + std::copy(src.begin(), src.end(), std::back_inserter(v)); + + // 打乱:必须传一个随机数引擎(C++11 起 rand() 不推荐) + std::shuffle(v.begin(), v.end(), std::mt19937{std::random_device{}()}); +} +``` + +这里有两个细节值得多说一句。`std::back_inserter(v)` 返回的是一个**输出迭代器(output iterator)**,你往它里面写东西,它就自动调 `v.push_back()`——这就避开了「我得先知道要拷多少个、提前 reserve 好空间」的麻烦,是 `copy` 最常见的搭档。`std::shuffle` 则提醒我们:**C++11 之后,随机数应该用 `` 头里的引擎(`std::mt19937` 等),而不是老的 `rand()`**——`rand()` 质量差、还有线程安全问题。 + +再看 `std::transform`,它把「对每个元素套一个函数」这件事封装好了。注意这里用了 `cbegin`/`cend`——**const 版本的迭代器**,表示「我只读源区间,不修改它」: + +```cpp +#include +#include +#include + +std::string s = "hello"; +std::string out; +std::transform(s.cbegin(), s.cend(), std::back_inserter(out), + [](char c) { return std::toupper(static_cast(c)); }); +// out == "HELLO" +``` + +`cbegin`/`cend` 返回 `const_iterator`,`rbegin`/`rend` 返回反向迭代器。一个容易踩的点是:**这些迭代器必须成对使用**——你不能拿 `cbegin()` 配 `end()`(一个 const 一个非 
const,类型不匹配)。C++20 之后,`const_iterator` 在标准库里的地位又被抬高了一截(P0896 等提案),因为 ranges 体系大量依赖它。 + +## rotate:参数顺序是最大的坑 + +`std::rotate` 是个很有用、但也特别容易写错的算法。它的作用是「把区间里的元素循环挪位,让 `middle` 指向的元素变成新的首元素」。签名是三个迭代器:`std::rotate(first, middle, last)`。 + +```cpp +std::vector v{1, 2, 3, 4, 5}; +std::rotate(v.begin(), v.begin() + 2, v.end()); +// 结果:{3, 4, 5, 1, 2} —— middle(begin+2,即 3) 变成了新首元素 +``` + +实测输出: + +```bash +❯ g++ -std=c++20 rot_ok.cpp -o rot_ok && ./rot_ok +rotate(begin, begin+2, end) on {1,2,3,4,5} -> { 3 4 5 1 2 } +``` + +这里的陷阱在于:**绝大多数算法是两个迭代器 `(first, last)`,唯独 `rotate`(还有 `partial_sort`、`nth_element` 等)是三个 `(first, middle, last)`**。人一旦形成「两个参数」的肌肉记忆,写 `rotate` 时就特别容易把 `middle` 和 `last` 的位置搞反。Shah 自己也吐槽过,他用 `upper_bound` 找插入点再 `rotate` 来手动实现插入排序,评价是「too clever, ugly」(太聪明了、丑)。 + +那写反了会怎样?我把 `middle` 和 `last` 互换,写成 `rotate(first, last, middle)`: + +```cpp +std::vector w{1, 2, 3, 4, 5}; +std::rotate(w.begin(), w.end(), w.begin() + 2); // 参数顺序错了 +``` + +```bash +❯ g++ -std=c++20 rot_bad.cpp -o rot_bad && ./rot_bad +about to call rotate(begin, end, begin+2)... 
+[程序崩溃,退出码 139 — SIGSEGV] +``` + +直接段错误(退出码 139 = SIGSEGV)。原因很直接:`std::rotate` 要求 `[first, middle)` 和 `[middle, last)` 都是合法子区间,换句话说三个迭代器必须满足 `first <= middle <= last` 的顺序。写成 `(first, last, middle)` 后,第二个子区间 `[middle_arg=last, last_arg=middle)` 就成了非法区间(终点在起点之前),算法去解引用越界位置,崩。 + +:::warning 三个迭代器的算法,参数顺序一定要看文档 +`rotate`、`partial_sort`、`nth_element`、`inplace_merge` 这些算法的参数都不是简单的 `(first, last)`,而是 `(first, middle, last)` 之类的三段。用之前一定要确认 `middle` 到底指什么。这一点在第三篇讲的 ranges 版本里会改善——因为 ranges 版本常常少传参数(直接传容器),减少了配对出错的机会。 +::: + +## 算法到底有多少个?「200 多个」要打折听 + +Shah 在演讲里提到一个流传很广的数字:「2018 年有场 CppCon 演讲说至少 105 个算法,现在有 200 多个了。」这个说法对不对?我们来较个真。 + +先说「105」这个数字的出处:它来自 Jonathan Boccara 在 CppCon 2018 的演讲《105 STL Algorithms in Less Than an Hour》。那是个**很宽松的计数口径**——它把 `_if` 变体(`find` / `find_if`)、`_n` 变体(`copy` / `copy_n`)、`_copy` 变体(`remove` / `remove_copy`)都分别算成一个独立算法,目的是演讲时好记、好讲。 + +那严格的数字是多少?我对照 cppreference 查了一下,截至 C++23: + +- `<algorithm>` 头里大约有 **91 个** `std::` 函数模板(不算 ranges 版本)。 +- `<numeric>` 头里有 **14 个**数值算法(`accumulate`、`reduce`、`inner_product` 等;C++26 还会再加 5 个饱和算术,凑成 19 个)。 +- `std::ranges::` 命名空间下大约有 **100 个**「受约束算法」(niebloid,就是 ranges 版本的算法)。 +- 另外还有 `<memory>` 里约 14 个未初始化内存相关的算法。 + +所以「200 多个」这个说法,**只有在把 `std::` 和 `std::ranges::` 两套 API 各算一份、再算上各种变体重载的口径下才成立**。如果按「不重复的算法名字」来数,实际大约是 **110 到 120 个**。 + +:::tip 怎么表述才准 +比起说「STL 有 200 多个算法」,更严谨的说法是:**STL 有 100 多个不重复的算法;如果把 `std::` 和 `std::ranges::` 两套接口都算作条目,API 入口确实超过 200 个。** 这个区分在面试或技术写作里挺重要——「200 多个」听起来唬人,但里面有大量是同一个算法的变体和 ranges 镜像版。 +::: + +## 陷阱一:迭代器失效——最隐蔽的杀手 + +算法本身用熟了不难,真正坑人的是**迭代器和容器的生命周期配合**。排第一的陷阱是**迭代器失效(iterator invalidation)**。 + +来看一段看起来人畜无害的代码: + +```cpp +std::vector v{1, 2, 3}; +auto it = v.begin(); // it 指向 v 的第一个元素 +v.push_back(4); // 如果触发扩容,it 就悬空了! 
+std::cout << *it << '\n'; // 解引用悬空迭代器 —— UB +``` + +问题出在 `push_back`。`vector` 内部是一块连续的动态数组,容量不够时会**重新分配一块更大的内存**,把旧元素搬过去,然后释放旧内存。但你的 `it` 还指着那块**已经被释放的旧内存**——它成了悬空指针(标准叫「singular iterator」)。这时候解引用 `*it`,就是未定义行为。 + +可怕的地方在于:**UB 不一定立刻崩**。它经常表现为「读到一个看起来正常的值」,于是你以为没事,把代码合进主干,然后某天在客户的机器上莫名其妙地崩溃。我们实测一下普通编译(不开调试)的情况: + +```cpp +#include +#include +int main() +{ + std::vector v{1, 2, 3}; + auto it = v.begin(); + std::cout << "before push_back: *it=" << *it << ", cap=" << v.capacity() << "\n"; + v.push_back(4); v.push_back(5); v.push_back(6); v.push_back(7); // 必然扩容 + std::cout << "after push_back: cap=" << v.capacity() << "\n"; + std::cout << "deref stale it: " << *it << "\n"; // UB:读已释放内存 +} +``` + +```bash +❯ g++ -std=c++20 -O0 inval.cpp -o inval && ./inval; echo "退出码=$?" +before push_back: *it=1, cap=3 +after push_back: cap=12 +deref stale it: -40771459 +退出码=0 +``` + +看到了吗——程序**正常退出(退出码 0),没有任何报错**,但读出来的值是 `-40771459` 这种垃圾值。`vector` 扩容后容量从 3 涨到 12,旧内存被释放了,`it` 指向的内存里残留着随机数据。这就是 UB 最阴险的样子:**静默错误**。 + +那怎么抓它?GCC/Clang 提供了一个调试宏 `-D_GLIBCXX_DEBUG`,开启后标准库的迭代器会带上边界和有效性检查,一旦你解引用失效迭代器,立刻 abort 并打印诊断。我们用同样的代码开调试编一遍: + +```bash +❯ g++ -std=c++20 -O0 -g -D_GLIBCXX_DEBUG inval.cpp -o inval_dbg && ./inval_dbg; echo "退出码=$?" +before push_back: *it=1, cap=3 +after push_back: cap=12 +/usr/include/c++/16.1.1/debug/safe_iterator.h:352: +Error: attempt to dereference a singular iterator. 
+Objects involved in the operation: + iterator "this" @ 0x7fff6bd63820 { + type = gnu_cxx::normal_iterator>(mutable iterator); + state = singular; ← 迭代器已失效 + references sequence with type 'std::debug::vector' @ 0x7fff6bd63850 + } +退出码=134 ← 134 = SIGABRT,被调试库主动 abort +``` + +这下被逮个正着:`state = singular` 明确告诉你这个迭代器失效了,`attempt to dereference a singular iterator` 精确指出你干了什么。一个 `-D_GLIBCXX_DEBUG` 宏,把「静默 UB」变成了「立刻炸+精准定位」——开发期开它,发布期关掉(它有性能开销)。MSVC 那边对应的开关是 `_ITERATOR_DEBUG_LEVEL=2`,Release 配置默认就是 0 或 1,Debug 配置才是 2。 + +:::tip 迭代器失效规则速查(已核对 cppreference) +不同容器,失效规则差别很大,记个大概就行,具体查表: + +- **`vector` / `string`**:`push_back` 仅在触发扩容(容量改变)时让**所有**迭代器失效;不扩容时只有 `end()` 会变。`reserve` 之后只要不超过预留容量,迭代器就不会失效。 +- **`deque`**:在两端插入会让**所有迭代器**失效(哪怕不扩容),但**引用和指针不失效**——所以遍历 deque 时要小心,存迭代器不如存引用。 +- **`list` / `forward_list`**:插入、`splice` **不失效**任何已有迭代器(链表节点不搬家),只有被 `erase` 掉的那个节点对应的迭代器失效。 +- **`unordered_*`**:`rehash`(触发于插入导致桶数变化)会让**迭代器失效,但引用和指针不失效**。 + +记住一个总原则:**只要容器内部可能「搬家」(连续存储的容器扩容、哈希表 rehash),迭代器就可能失效;节点型容器(list、树的节点)不搬家,迭代器就稳。** +::: + +## 陷阱二:配错迭代器对——begin 和 end 必须来自同一个对象 + +第二个陷阱和「配对」有关。算法要求 `first` 和 `last` 来自**同一个容器**,但 C++ 没法在运行时强制检查这件事——你传两个来自不同容器的迭代器,编译器照单全收,然后就是 UB。 + +最经典的翻车场景来自 Jason Turner 的 C++ Weekly(Shah 在演讲里专门引用了):一个函数返回一个临时的 `vector`,你图省事直接 `.begin()` 和 `.end()` 连着调: + +```cpp +std::vector download_data(); // 每次调用返回一个全新的临时 vector + +// 危险写法: +// process(download_data().begin(), download_data().end()); +``` + +:::warning Shah 这里说轻了 +Shah 对这段代码的点评是「也许有时能工作,也许我们运气好」——这个说法**可能误导新手**,因为它暗示「这玩意儿有合法的能工作的情况」。**没有。** 这就是未定义行为,不存在「合法能工作」的路径,只有「UB 偶然表现正常」的假象。 + +原因:两次 `download_data()` 是**两次独立的函数调用**,返回的是**两个不同的临时 `vector`**。它们的 `.begin()` 和 `.end()` 指向两块毫无关系的内存。把一个临时量的 `begin` 和另一个临时量的 `end` 配成一对喂给算法——区间根本不合法。更糟的是,这两个临时量在这条语句结束时就被析构了,算法拿着的迭代器一开始就悬空。**正确写法是先把结果存进一个具名变量**,让 `begin` 和 `end` 来自同一个存活的对象: + +```cpp +auto data = download_data(); // 一个具名变量,一份内存 +process(data.begin(), data.end()); // begin/end 来自同一个 data —— 安全 +``` + +这种「函数名相同就以为指的是同一个对象」的错觉,是配对出错的高发区。 +::: + +## 
陷阱三:空间不足——往固定大小的地方塞太多 + +第三个陷阱和输出目标有关。当你用 `std::copy` 把数据写到一个**固定大小**的目标时(比如原生数组、或者没加 `back_inserter` 的容器),如果源区间比目标空间大,就会**越界写**——同样是 UB,而且可能默默破坏相邻内存。 + +```cpp +int src[10] = {0,1,2,3,4,5,6,7,8,9}; +int dst[3]; // 只有 3 个位置! +std::copy(std::begin(src), std::end(src), std::begin(dst)); // 越界写 —— UB +``` + +这段代码能编过、能跑、不会立刻报错,但你往 `dst` 后面的内存里写了 7 个不该写的值。这种 bug 用 AddressSanitizer(`-fsanitize=address`)能抓出来,它会报告一个 heap/stack buffer overflow。 + +规避办法很直接:要么用 `std::back_inserter`(让目标容器自动扩容),要么在 copy 前 `reserve` 足够空间、并确认源区间不大于目标容量。回到第一条经验:**让容器自己管大小(用 inserter),比你自己手算大小安全得多。** + +## 报错质量:ranges 真的报错更友好吗 + +Shah 在总结里说「Ranges 用了 concepts,会给你更好的错误信息」。这话对,但需要打个折,我们实测对比一下「传错参数」时两套接口的报错。 + +先看经典 `std::sort` 传错——把 `vector` 的 `begin` 和 `list` 的 `end` 配在一起(类型不匹配): + +```cpp +std::vector v{1,2,3}; +std::list l{4,5,6}; +std::sort(v.begin(), l.end()); // 两个不同容器的迭代器 +``` + +再看 ranges 版本传错——把一个根本不是 range 的东西传给 `std::ranges::sort`: + +```cpp +int not_a_range = 42; +std::ranges::sort(not_a_range); +``` + +两者 GCC 16.1.1 的报错行数: + +```bash +❯ # 经典版 +❯ g++ -std=c++20 err_classic.cpp 2>err_c.txt; wc -l < err_c.txt +32 +❯ head -3 err_c.txt +err_classic.cpp:7:14: error: no matching function for call to + 'sort(std::vector::iterator, std::__cxx11::list::iterator)' + +❯ # ranges 版 +❯ g++ -std=c++20 err_ranges.cpp 2>err_r.txt; wc -l < err_r.txt +69 +``` + +有意思的来了——**在这个具体例子里,ranges 版的报错(69 行)反而比经典版(32 行)更长**。这是因为传一个 `int` 给 `ranges::sort`,编译器要把整条 concept 约束链(`sortable` → `random_access_iterator` → ...)一路展开给你看,链条越长报错越铺张。所以我得诚实地纠正一个常见印象:**「ranges 报错一定更短更友好」并不成立**,它的可读性很依赖编译器版本和具体场景(GCC 10+ / Clang 12+ 之后才比较成熟,旧编译器照样一屏幕模板天书)。 + +那 ranges 在「报错」这件事上真正的优势是什么?不是行数,而是**它让你根本写不出某些 bug**。回想上面陷阱二——经典 `std::sort` 接收两个迭代器,你完全可以把两个不同容器的 `begin`/`end` 配错(像 `err_classic` 那样),编译器要等到实例化时才报错。而 `std::ranges::sort` **只接收一个容器**,你压根没法表达「begin 来自 A、end 来自 B」这种错误。**少一个出错的机会,比报错更友好要实在得多。** 这才是 ranges 在安全性上的核心收益,我们第三篇会展开。 + +## 过渡:迭代器必须滚蛋? 
+ +讲到这里,Shah 放了一张相当夸张的幻灯片——「迭代器必须滚蛋(Iterators must die)」。夸张归夸张,但他想表达的情绪是真实的:**迭代器这套接口虽然强大,但用起来坑多**——配对容易错、参数顺序(三个迭代器的算法)容易反、局部排序的写法丑陋。 + +好消息是,C++20 的 Ranges 正是冲着这些痛点来的。它没有抛弃迭代器(迭代器仍然是底层机制,连 C++26 都离不开它),而是在迭代器之上包了一层更安全、更好组合的接口:**直接传容器而不是迭代器对、用 concepts 在编译早期拦截类型错误、用 views 实现惰性组合**。这些是第三篇的主线。 + +下一篇我们就正式进入 Ranges——从「`ranges::sort` 为什么少传一个参数」开始,一路讲到 views 的惰性求值、管道操作符、`ranges::to`,以及一个让人眼前一亮的特性:**无限 range**。如果你对数值算法(`reduce`、`transform_reduce`)的并行版本感兴趣,可以提前看看 vol5 并发卷里关于 `` 执行策略和 `std::reduce` 并行归约的内容——那是算法和并发交汇的地方。 + + + + + + + + + + + diff --git a/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/03-ranges-views-and-composition.md b/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/03-ranges-views-and-composition.md new file mode 100644 index 000000000..c80b94812 --- /dev/null +++ b/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/03-ranges-views-and-composition.md @@ -0,0 +1,428 @@ +--- +title: "Ranges、Views 与管道组合:惰性求值的力量" +description: "CppCon 2025 演讲笔记 —— Mike Shah:受约束算法、views 惰性求值、管道操作符、ranges::to,补 eager vs lazy 实测基准、无限 range、views 版本归属表(C++20/23/26)" +conference: cppcon +conference_year: 2025 +talk_title: "Back to Basics: C++ Ranges" +speaker: "Mike Shah" +video_youtube: "https://www.youtube.com/watch?v=Q434UHWRzI0" +tags: + - cpp-modern + - host + - intermediate + - Ranges +difficulty: intermediate +platform: host +cpp_standard: [20, 23] +chapter: 3 +order: 3 +--- + +# Ranges、Views 与管道组合:惰性求值的力量 + +:::tip +这是 CppCon 2025 Mike Shah "Back to Basics: C++ Ranges" 系列的收官篇。前两篇我们走完了「循环 → 迭代器 → 算法」这条线,也把迭代器的几个经典陷阱(失效、配对、参数顺序)拆了一遍。这一篇正式进入 Ranges 的核心:受约束算法、views 的惰性求值、管道组合,以及把结果物化回容器的 `ranges::to`。本篇实验较多,且横跨 C++20 与 C++23,所以编译选项会在 `-std=c++20` 和 `-std=c++23` 之间切换——这一点本身就是本篇的一个伏笔。环境:Arch Linux WSL,GCC 16.1.1。 +::: + +上一篇结尾,Shah 用一张「迭代器必须滚蛋」的夸张幻灯片收尾。这一篇我们就来看 Ranges 是怎么在迭代器之上,重新设计一层更安全、更好组合的接口的。先从最基础的问题开始:**Ranges 到底改了什么?** + +## range 还是那对迭代器,但 end 可以是「哨兵」 + +底层定义没变——一个 range 仍然由一个起点和一个终点界定。但 C++20 
给了它一个重要的扩展:**终点可以是一个和起点不同类型的东西,叫做哨兵(sentinel)**。 + +为什么要允许不同类型?看一个经典例子:遍历一个以 `'\0'` 结尾的 C 字符串。在传统迭代器模型里,你得先 `strlen` 算出长度,才能确定 `end`——但你明明只需要「一直走,直到遇到 `'\0'`」就行了。sentinel 就是表达「走到某个条件成立为止」的终点,它的类型可以和迭代器不同,只要它们之间能比较(`it == sentinel`)就行。这让遍历「不知道长度的序列」变得自然——而这一点,正是后面「无限 range」能成立的基础。 + +## 从 range-v3 到标准 Ranges:concepts 是关键拼图 + +Ranges 这套东西不是 C++20 凭空冒出来的。它的原型是 Eric Niebler 的 **range-v3** 库,在 C++14 时代就能用了。如果你现在的工程还卡在 C++14/17,可以直接用 range-v3 练手——它的 API 和标准库 Ranges 高度相似,将来迁移成本很低。 + +那为什么标准库版本拖到了 C++20?**因为 Ranges 的落地严重依赖 concepts(概念)**。Ranges 需要精确表达「什么东西算一个 range」「什么迭代器算随机访问的」这类约束。在 concepts 出现之前,这些约束只能靠 SFINAE(替换失败不是错误)来实现——结果是:一旦你传错类型,编译器吐出来的错误信息动辄几十行模板天书,根本没法读。concepts 让约束可以被命名、被早期求值,这才是 Ranges 能进标准的最后一块拼图。 + +## 受约束算法:少传一个参数,少一个出错的机会 + +Ranges 最直接的体感改进,就是**受约束算法(constrained algorithms)**——cppreference 上的正式叫法。它们和经典算法同名,但放在 `std::ranges::` 命名空间下。区别在于:**经典算法要你传一对迭代器 `(first, last)`,ranges 版本只要传一个容器(或任何 range)就行**。 + +```cpp +#include +#include +#include + +std::vector v{3, 1, 4, 1, 5, 9}; + +std::sort(v.begin(), v.end()); // 经典:传一对迭代器 +std::ranges::sort(v); // ranges:传整个容器 +``` + +`ranges::sort(v)` 做的事和 `sort(v.begin(), v.end())` 完全一样,但它少传了两个参数。这带来的好处不只是少敲字——回到上一篇陷阱二「配错 begin/end」,**经典算法允许你把两个不同容器的迭代器配错,ranges 版本根本没给你这个机会**,因为它只收一个对象。少一种出错的可能,就是实打实的安全性提升。 + +受约束算法也支持 span、自定义容器、任何符合 `std::ranges::range` 概念的东西: + +```cpp +int arr[] = {3, 1, 4}; +std::ranges::sort(arr); // 原生数组也行 + +std::ranges::find_if(v, [](int i) { return i > 4; }); +// ranges::find_if 同样返回迭代器(指向找到的元素), +// 用 ranges::end(v) 判断是否没找到 +``` + +:::tip 迭代器知识没作废 +注意 `ranges::find_if` 仍然返回一个迭代器——**这意味着上一篇讲的迭代器知识全部还有用**,迭代器失效、配对这些问题在 ranges 里依然存在,只是 Ranges 的接口让你更难犯这些错(不是消除,是变难)。C++26 里我们仍然需要迭代器。 +::: + +## views:惰性求值,Ranges 的灵魂 + +受约束算法只是开胃菜,Ranges 真正的杀手锏是 **views(视图)**。一个 view 是一种**惰性(lazy)**访问 range 的方式——它不拷贝数据、不预先计算结果,而是在你遍历它的时候,**一次处理一个元素**。 + +对比一下两种风格。`std::ranges::sort(v)` 是**急切求值(eager)**——它立刻、当场把整个区间排好序,跑完才返回。而 `std::views::filter(...)` 
是**惰性求值(lazy)**——它只是搭好一个「过滤管道」,什么计算都不做,直到你真正去遍历它,每遍历到符合条件的一个元素,才把它交给你。 + +```cpp +#include +#include +#include + +std::vector v{1, 2, 3, 4, 5, 6}; + +// 搭管道:此时 filter 一个元素都没处理 +auto gt3 = v | std::views::filter([](int x) { return x > 3; }); + +// 遍历时才真正执行过滤 +for (int x : gt3) { + std::cout << x << ' '; // 4 5 6 +} +``` + +那个 `|` 是**管道操作符(pipe operator)**,借鉴自 Unix 管道——把左边的 range 喂给右边的 view 适配器(range adaptor)。你可以把多个 view 串起来,像流水线一样组合: + +```cpp +auto result = v + | std::views::filter([](int x) { return x > 1; }) // 过滤 + | std::views::transform([](int x) { return x * x; }) // 变换 + | std::views::take(3); // 只取前 3 个 +// 遍历 result 时:3²=9, ... 一路惰性求值 +``` + +## 实验:eager vs lazy,到底差多少 + +光说「惰性更省」不够直观,我们上基准。造一个一千万个元素的 `vector`,比较两种写法:**eager**——先用 `ranges::to` 把过滤结果物化成一个临时 `vector`,再遍历求和;**lazy**——直接遍历 `views::filter`,不建临时容器。 + +```cpp +#include +#include +#include +#include +#include +#include + +int main() +{ + constexpr int N = 10'000'000; + std::vector v(N); + std::iota(v.begin(), v.end(), 0); + const auto pred = [](int x) { return x > N / 2; }; + + // EAGER:物化过滤结果到一个临时 vector,再求和 + long long se = 0; + auto t0 = std::chrono::high_resolution_clock::now(); + { + auto tmp = v | std::views::filter(pred) | std::ranges::to>(); + for (int x : tmp) se += x; + } + auto t1 = std::chrono::high_resolution_clock::now(); + + // LAZY:直接遍历 view,不建临时容器 + long long sl = 0; + auto t2 = std::chrono::high_resolution_clock::now(); + for (int x : v | std::views::filter(pred)) sl += x; + auto t3 = std::chrono::high_resolution_clock::now(); + + auto ms_e = std::chrono::duration_cast(t1 - t0).count(); + auto ms_l = std::chrono::duration_cast(t3 - t2).count(); + std::cout << "sum eager=" << se << " lazy=" << sl << "\n"; + std::cout << "eager (ranges::to 临时 + 求和): " << ms_e << " ms\n"; + std::cout << "lazy (直接遍历 view): " << ms_l << " ms\n"; +} +``` + +GCC 16.1.1,`-std=c++23 -O2`: + +```bash +❯ g++ -std=c++23 -O2 -Wall bench.cpp -o bench && ./bench +sum eager=37499992500000 lazy=37499992500000 
+eager (ranges::to 临时 + 求和): 23 ms +lazy (直接遍历 view): 7 ms +``` + +两种写法算出来的和完全一致(`37499992500000`,校验通过),但 **eager 花了 23ms,lazy 只花了 7ms——快了 3 倍多**,而且 lazy 版本**没有分配那个几百万元素的临时 `vector`**。eager 慢就慢在两件事:一是要把五百万个符合条件的元素拷进临时 vector(一堆 `push_back` + 可能的扩容),二是多了一次完整的遍历(先物化、再求和,等于遍历两遍)。lazy 只遍历一遍,边过滤边求和,过滤掉的元素直接跳过,连拷贝的影子都没有。 + +:::tip 怎么亲眼看见「惰性」 +想直观感受「管道搭好不执行、遍历才执行」,有个简单办法:在 filter 和 transform 的 lambda 里各加一句 `std::cout`,然后**只搭管道不遍历**——你会发现什么都不会打印。一旦你写 `for (auto x : pipeline)`,每个元素才会**走完整个管道再处理下一个**:第一个元素先过 filter、过了才进 transform、再进 take……是一个元素贯穿到底,而不是先把所有元素都 filter 完再 transform。这就是惰性执行模型,也是后面「短路」能成立的原因。 +::: + +## 无限 range:惰性启用的魔法 + +惰性求值解锁了一个很酷的能力——**无限 range**。如果求值是急切的,无限序列根本没法表达(你没法预先算出无穷多个元素)。但有了惰性,只要你不真正去遍历「无穷」,它就能存在。 + +`std::views::iota(x)` 从 `x` 开始,生成一个**无限递增**的序列。配合 `take` 截断,就能安全使用: + +```cpp +// 生成 0², 1², 2², ... 的前 5 个 +for (int x : std::views::iota(0) + | std::views::transform([](int n) { return n * n; }) + | std::views::take(5)) { + std::cout << x << ' '; +} +``` + +```bash +❯ g++ -std=c++23 -O2 iota.cpp -o iota && ./iota +0 1 4 9 16 +``` + +`iota(0)` 本身是无限的(0, 1, 2, 3, ...),但 `take(5)` 把它截断成 5 个元素。惰性求值保证:`take` 之外的无限部分**永远不会被求值**。这种「定义一个无限的源,再用 view 限定用到多少」的模式,在处理流式数据、生成序列时非常顺手。`iota` 是 C++20 就有的 range 工厂。 + +## 管道短路:lazy 带来的效率 + +惰性的另一个直接收益是**短路(short-circuiting)**。当你把多个 filter 串起来时,一个元素只要在某一关被过滤掉,**后面的关卡完全不会处理它**——因为它是「一个元素贯穿到底」的执行模型。 + +Shah 举的例子是过滤字符串集合:先筛「以 M 开头」,再筛「长度大于 4」。如果一个字符串不以 M 开头,它第一个 filter 就被拦下了,第二个 filter 的谓词**根本不会被调用**。我们来量化一下这个效果——给 filter 的谓词加个计数器,对比「全量遍历」和「加 `take(5)` 提前终止」时谓词被调用的次数: + +```cpp +long long calls_all = 0, calls_take = 0; +auto cp_all = [&](int) { ++calls_all; return true; }; +auto cp_take = [&](int) { ++calls_take; return true; }; + +for ([[maybe_unused]] int x : v | std::views::filter(cp_all)) {} +for ([[maybe_unused]] int x : v | std::views::filter(cp_take) | std::views::take(5)) {} + +std::cout << "filter 谓词调用次数: 全量=" << calls_all + << " 加 take(5)=" << calls_take << "\n"; +``` + +在一千万个元素的 `v` 上: + +```bash +filter 
谓词调用次数: 全量=10000000 加 take(5)=6 +``` + +**一千万次 vs 6 次**。加了 `take(5)` 之后,谓词只被调用了 6 次(取到 5 个元素需要判断 6 次)就停了,剩下的一千万次求值全部被惰性短路掉。如果你只关心「前几个满足条件的元素」,这种写法比「先过滤出一个完整列表再取前 5 个」快了不止一个数量级——因为后者(eager)必须把所有元素都过一遍谓词。 + +## ranges::to:把惰性结果物化回容器(C++23) + +views 是惰性的,但很多时候你最后还是想要一个**实实在在的容器**(比如要多次随机访问、要传给只收容器的接口)。把 view 物化成容器,就是 `std::ranges::to` 的活: + +```cpp +auto collected = std::vector{1, 2, 3, 4, 5, 6} + | std::views::filter([](int x) { return x % 2 == 0; }) + | std::ranges::to>(); +// collected == {2, 4, 6} +``` + +```bash +❯ ./ranges_to_demo +ranges::to (evens): 2 4 6 +``` + +:::warning 这里有个版本陷阱,Shah 漏标了 +Shah 在演讲里说「我们有了 `ranges::to`」,语气像是它和受约束算法一起、从 C++20 就有。**不是。** `std::ranges::to` 是 **C++23** 才进标准的(提案 P1206R7,特性测试宏 `__cpp_lib_ranges_to_container=202202L`),比 C++20 的受约束算法晚了一个版本。 + +我用同一个程序在两个标准下编译,结果一目了然: + +```cpp +auto col = v | std::views::filter(pred) | std::ranges::to>(); +``` + +```bash +❯ g++ -std=c++20 probe.cpp +probe.cpp:12:78: error: ‘to’ is not a member of ‘std::ranges’ + 12 | ... 
| std::ranges::to>(); + | ^~ + +❯ g++ -std=c++23 probe.cpp && echo OK +OK +``` + +`-std=c++20` 直接报 `'to' is not a member of 'std::ranges'`,`-std=c++23` 才编得过。所以如果你的工程还在 C++20,`ranges::to` 用不了——得手动 `reserve` + 循环 `push_back`,或者用 `std::copy` 配 inserter。最低工具链版本大概是 GCC 14 / Clang 18+libc++ / MSVC VS2022 17.5。 + +:::tip 管道支持也是 C++23,不是「后来补的」 +`r | ranges::to()` 这种管道写法,来自提案 P2387R3。它和 P1206 是**同期在 C++23 一起落地**的,不是「先有 `ranges::to`、后来才补上管道」。所以你不用担心「管道版是个补丁」——它从一开始就是 C++23 的完整部分。 +::: +::: + +## views 速查表:哪个是哪个标准来的 + +这是本篇的另一个二创重点。views 在 C++20 之后还在持续膨胀,C++23 加了一大票,C++26 还在加。Shah 在演讲里笼统地把 `drop_while`、`chunk_by`、`zip`、`zip_transform` 都叫「新东西」,但**没标版本**——这几个其实分属不同标准,搞混了会编不过。我把 cppreference 上核对过的版本归属列出来: + +| 标准 | views(代表性) | +|------|------| +| **C++20** | `filter`、`transform`、`take`、`drop`、`take_while`、`drop_while`、`reverse`、`join`、`split`、`keys`、`values`、`elements`、`iota`(无限)、`lazy_split`、`common`、`counted`、`all` | +| **C++23** | `zip`、`zip_transform`、`chunk`、`chunk_by`、`slide`、`join_with`、`stride`、`cartesian_product`、`as_const`、`as_rvalue`、`enumerate`、`adjacent`、`adjacent_transform`、`pairwise`、`pairwise_transform`、`repeat`(工厂) | +| **C++26** | `cache_latest`(另有 `concat`、`as_input`、`indices` 等在推进) | + +:::warning 几个容易记错的版本 + +- **`drop_while` 是 C++20**,不是 C++23——别因为它「看起来新」就归到 23。 +- **`chunk_by`、`zip`、`zip_transform` 是 C++23**(`zip`/`zip_transform` 来自 P2321,`chunk_by` 来自 P2443;注意 P2442 是 `chunk`/`slide`,P2210 是 C++20 的 `split` 修订),需要 `-std=c++23`。 +- **`as_rvalue` 是 C++23**,特别容易被误记成 C++26——因为它听起来「很新」,其实是和 zip 那批一起进来的。 +- **`join` 是 C++20,但 `join_with` 是 C++23**——别把带 `_with` 的版本当成 C++20。 +::: + +我们实测跑几个 C++23 的 view,感受一下它们的威力。`chunk_by` 按连续相等的元素分组: + +```cpp +std::vector run{1, 1, 2, 3, 3, 3, 4, 5}; +for (auto ch : run | std::views::chunk_by([](int a, int b) { return a == b; })) { + std::cout << '['; + for (int x : ch) std::cout << x; + std::cout << ']'; +} +``` + +```bash +❯ g++ -std=c++23 -O2 chunk.cpp -o chunk && ./chunk +[11][2][333][4][5] +``` + +连续相等的元素被各自分到了一组。`zip` 则是把多个 range「拉链」式并行遍历,长度取最短的那个: + +```cpp +std::vector 
a{1, 2, 3}; +std::vector b{'x', 'y', 'z'}; +for (auto [x, y] : std::views::zip(a, b)) { + std::cout << '(' << x << y << ')'; +} +``` + +```bash +❯ ./zip_demo +(1x)(2y)(3z) +``` + +以前要并行遍历两个容器,你得手写两个下标、担心越界;`zip` 把这件事变成了一行管道,还能直接用结构化绑定解包。这些 C++23 新 view 大大拓宽了「用管道表达数据处理流水线」的能力边界。 + +## 自定义迭代器:迭代器就是个「可替换前进逻辑的伪指针」 + +:::tip 这一小节是进阶,可跳过 +如果你想更扎实地理解「迭代器到底是什么」,可以自己手写一个。下面是一个最小化的单向链表节点迭代器——它证明了:**迭代器的本质就是一个「能 `++`、能 `*`、能比较」的对象,前进逻辑完全可替换。** +::: + +```cpp +struct Node +{ + int data; + Node* next; +}; + +struct NodeIterator +{ + Node* current; + + int& operator*() const { return current->data; } + NodeIterator& operator++() { current = current->next; return *this; } + bool operator!=(const NodeIterator& other) const { return current != other.current; } +}; +``` + +只要解引用、前置 `++`、不等比较这三个操作齐了(再加上可拷贝),它就足以驱动 range-based for。但要让它满足 C++20 的 `std::forward_iterator`、进而塞进受约束算法,上面这个最小版本还不够:至少还得补上 `value_type`/`difference_type` 这两个成员类型、后置 `++`、`operator==` 和默认构造。容器内部是链表、是树、是图,对外都可以伪装成「一个能一步步走的伪指针」。这就是迭代器抽象的力量——也是为什么 Ranges 选择在迭代器之上构建,而不是另起炉灶。 + +## 坑位清单:用 Ranges 也要留神 + +最后把本系列三篇里散落的坑位集中一下,方便你复习。Ranges 让很多错误**变难犯**了,但没消灭它们: + +1. **`std::advance` 不做边界检查**——越界即段错误,泛型代码里先 `std::distance` 判断。 +2. **`begin`/`end` 必须来自同一个容器**——`process(f().begin(), f().end())` 是 UB,存进具名变量。 +3. **`list`/`set` 迭代器不支持 `+n`/`-n`**——排序用成员 `sort()`,别硬塞 `std::sort`。 +4. **view 不拥有数据**——它只是底层 range 的一个视图,底层容器一旦失效(扩容、rehash、析构),view 就悬空了。**别让 view 的生命周期超过它观察的容器。** +5. **`ranges::to` 没有 `take` 兜底会吃光内存**——把一个无限的 `iota` 直接 `ranges::to()` 会无限物化,内存撑爆;务必先 `take` 限定。 +6. **`reverse` 配合单遍迭代器的 view 可能编不过**——有些 view 要求双向迭代器,单向的 `forward_list` 视图上用 `reverse` 会编译失败。 +7. 
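**Algorithm error messages are not necessarily shorter**: ranges use concepts to catch errors earlier and more precisely, but diagnostics for deeply nested constraints can still be long. The real benefit is "certain bugs cannot be written", not "fewer lines of error output".
+
+## What we have sorted out across the three articles
+
+From the index loop in the first article to the view pipelines in this one, we have walked through how C++ abstractions for traversing and processing data evolved. The core of this article boils down to a few points. Constrained algorithms let you **pass fewer arguments and mismatch fewer iterator pairs**. Lazy evaluation of views is the soul of Ranges: it **copies nothing and precomputes nothing, and during traversal each element flows through the whole pipeline**, which in our measurement was more than 3x faster than eager materialization (7 ms vs 23 ms) while using less memory. Laziness enables **unbounded ranges** (`iota`) and **short-circuiting** (adding `take(5)` cut predicate calls from ten million to 6). `ranges::to` materializes a lazy result back into a container, but **it is C++23**, so do not be misled by the "we now have ranges::to" tone. Views are still evolving: `chunk_by`/`zip`/`zip_transform` are C++23, and `cache_latest` and friends are C++26.
+
+Looking back at Shah's remark that "algorithms are essentially loops", we can now complete it: the goal of modern C++ is precisely to **spare you from writing those loops by hand**. Constrained algorithms replace hand-written sort and search loops, and view pipelines replace multi-pass "filter, transform, collect" loops, so the code reads closer to "what you want" than to "how to do it". That is the design philosophy of Ranges.
+
+If you want to go deeper, there are several directions. The concepts articles in vol4 help you understand the constraint system behind ranges. The perfect forwarding and SIMD material in the vol6 performance volume follows the same "avoid unnecessary copies" line as views. The cppreference pages for the [Ranges library](https://en.cppreference.com/w/cpp/ranges) and [Constrained algorithms](https://en.cppreference.com/w/cpp/algorithm/ranges) are the most authoritative quick references. Ranges is not perfect (for problems such as iterator invalidation it only makes the mistake harder to commit), but it does make writing better, safer, and faster data-processing code far smoother than it was in the C++11 era.
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/index.md b/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/index.md
new file mode 100644
index 000000000..fe59d0d60
--- /dev/null
+++ b/documents/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/index.md
@@ -0,0 +1,32 @@
+---
+title: "Back to Basics: C++ Ranges"
+description: "CppCon 2025 talk notes: Mike Shah, an introduction to C++ Ranges"
+conference: cppcon
+conference_year: 2025
+talk_title: "Back to Basics: C++ Ranges"
+speaker: "Mike Shah"
+video_youtube: "https://www.youtube.com/watch?v=Q434UHWRzI0"
+tags:
+ - cpp-modern
+ - host
+ - beginner
+difficulty: beginner
+platform: host
+cpp_standard: [20, 23]
+---
+
+
+
+## Notes index
+
+
+ From loops to iterators: the road to abstracting data traversal
+ STL algorithms in practice and iterator pitfalls
+ Ranges, views, and pipeline composition: the power of lazy evaluation
+
diff --git a/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md 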
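b/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md
new file mode 100644
index 000000000..5da8a7ba8
--- /dev/null
+++ b/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md
@@ -0,0 +1,443 @@
+---
+title: "The cost of copying and the motivation for moving: from swap to MyString"
+description: "CppCon 2025 talk notes: starting from the three deep copies in swap, we hand-build a MyString class, expose the wasted copies of temporaries, and arrive at the core motivation for move semantics"
+conference: cppcon
+conference_year: 2025
+talk_title: "Back to Basics: Move Semantics"
+speaker: "Ben Saks"
+video_bilibili: "https://www.bilibili.com/video/BV1X54y1P7uM"
+video_youtube: "https://www.youtube.com/watch?v=szU5b972F7E"
+tags:
+ - cpp-modern
+ - host
+ - beginner
+difficulty: beginner
+platform: host
+cpp_standard: [11, 17, 20]
+chapter: 4
+order: 1
+---
+
+# Starting from swap: a story of three copies
+
+:::tip
+A quick note: this part is a derivative write-up based on the CppCon talk. The link above points to the YouTube video series; readers in mainland China can use the Bilibili link instead.
+:::
+
+Copying (specifically copying, not moving) is an extremely common operation in C++. The trouble is that many objects, containers for example, are expensive to copy in most cases. Move semantics was introduced to turn those expensive copies into cheap "hand-over" operations.
+
+That sounds great, but what does "hand-over" actually mean? We start with an example everyone has seen: the `swap` function.
+
+## swap in C++03: three deep copies
+
+A generic swap written in C++03, before move semantics existed, looks like this:
+
+```cpp
+template <typename T>
+void swap(T& x, T& y)
+{
+    T temp(x); // copy 1: copy the value of x into temp
+    x = y;     // copy 2: copy the value of y into x
+    y = temp;  // copy 3: copy the value of temp into y
+}
+```
+
+In terms of what actually executes, every line here is a copy. Functionally, though, what we really want is to move the value in x to y and the value in y to x. For a built-in type such as `int`, copying and moving are the same thing: an `int` has no internal structure, and copying one just duplicates its few bytes (typically 4). For class types that own dynamically allocated memory (such as `std::string` or `std::vector`), every copy can mean a `malloc` plus a `memcpy`, plus a `free` at destruction.
+
+Today we want to understand why copying is so expensive and how move semantics cuts that cost down.
+
+The experiments in this article run on Arch Linux under WSL with GCC 16.1.1. Environment details:
+
+```bash
+❯ gcc -v
+Using built-in specs.
+COLLECT_GCC=gcc
+COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/16.1.1/lto-wrapper
+Target: x86_64-pc-linux-gnu
+gcc version 16.1.1 20260430 (GCC)
+
+❯ uname -a
+Linux Charliechen 6.18.33.1-microsoft-standard-WSL2 #1 SMP PREEMPT_DYNAMIC ... 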
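x86_64 GNU/Linux
+```
+
+## Hand-building MyString: where exactly copying gets expensive
+
+To see the problem more clearly, we write a simplified string class ourselves: `MyString`. It stores its contents in a dynamically allocated character array, much like the first string class you wrote when learning C++. `std::string` is far more complex (it has the SSO optimization, where small strings live inside the object with no heap allocation), but MyString is enough to expose the cost of copying.
+
+As an aside, if I were writing this code today I would manage the dynamic array with `std::unique_ptr`. But `unique_ptr` already implements move semantics, and using it would make it impossible to show "what happens without move semantics". So I deliberately use a raw pointer. Likewise, I leave out useful qualifiers such as `constexpr` and `[[nodiscard]]` to keep the slides uncluttered.
+
+### Basic structure: construction and destruction
+
+```cpp
+#include <cstddef>
+#include <cstring>
+
+class MyString
+{
+    std::size_t stored_length_;
+    char* actual_str_;
+
+public:
+    // Constructor: allocate exactly as much memory as needed
+    MyString(const char* s)
+        : stored_length_(std::strlen(s))
+        , actual_str_(new char[stored_length_ + 1])
+    {
+        std::memcpy(actual_str_, s, stored_length_ + 1);
+    }
+
+    // Destructor: release the dynamic array
+    ~MyString()
+    {
+        delete[] actual_str_;
+    }
+
+    // Copying (and therefore moving) is disabled for now
+    MyString(const MyString&) = delete;
+    MyString& operator=(const MyString&) = delete;
+
+    // Accessors
+    const char* c_str() const { return actual_str_; }
+    std::size_t size() const { return stored_length_; }
+};
+```
+
+Create a `"hello"` string and the memory layout looks roughly like this: `stored_length_` holds 5, and `actual_str_` points to 6 bytes on the heap (5 characters plus the terminating `'\0'`). On destruction, `delete[] actual_str_` frees that block. Very straightforward.
+
+### Copy constructor: why a deep copy is required
+
+Now the question: if I want to create `s2` from `s1`, an independent string with the same value, can I just copy the two data members?
+
+```cpp
+// Dangerous! A shallow copy leads to a double delete
+MyString s1("hello");
+MyString s2(s1); // if only stored_length_ and the actual_str_ pointer were copied...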
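+```
+
+No. If `s2`'s `actual_str_` pointed to the same memory, both `s1` and `s2` would run `delete[]` on that one block when destroyed. That is a double delete, which is undefined behavior.
+
+So the copy constructor must perform a **deep copy**: allocate memory owned exclusively by the new object, then copy the contents over:
+
+```cpp
+// Copy constructor: deep copy
+MyString(const MyString& other)
+    : stored_length_(other.stored_length_)
+    , actual_str_(new char[other.stored_length_ + 1])
+{
+    std::memcpy(actual_str_, other.actual_str_, stored_length_ + 1);
+}
+```
+
+This is correct, but the price is one `new` (a heap allocation) plus one `memcpy`. For short strings the heap allocation costs far more than copying the characters themselves.
+
+### Copy assignment operator: overwriting an existing object
+
+Copy construction and copy assignment are easy to confuse because both can be written with `=`. Telling them apart is simple: **check whether the target object already exists before the assignment**. If it does (such as `s1` in `s1 = s2;`), it is assignment; if a new object is being created (such as `MyString s2(s1);`), it is construction.
+
+Assignment needs one more step than construction, because the old value has to be cleaned up:
+
+```cpp
+// Copy assignment operator
+MyString& operator=(const MyString& other)
+{
+    if (this != &other) {
+        delete[] actual_str_; // clean up the old value
+        stored_length_ = other.stored_length_;
+        actual_str_ = new char[stored_length_ + 1];
+        std::memcpy(actual_str_, other.actual_str_, stored_length_ + 1);
+    }
+    return *this;
+}
+```
+
+Note that this version `delete[]`s the old array first and only then `new`s the replacement. That order is not exception-safe: if `new` throws, the old array is already gone and the new one was never allocated, so the object is left in an unrecoverable state. Allocating first and deleting afterwards avoids the problem. We set exception safety aside for now (production code should use the copy-and-swap idiom) and focus on the core logic.
+
+### operator+: wasted copies of temporaries
+
+MyString now has a complete set of copy operations. But with only copying implemented, the type effectively has **no move semantics**: any attempt to "move" it degrades to a copy. Consider the most typical scenario, string concatenation:
+
+```cpp
+// Concatenate two strings
+MyString operator+(const MyString& lhs, const MyString& rhs)
+{
+    std::size_t new_len = lhs.size() + rhs.size();
+    char* buf = new char[new_len + 1];
+    std::memcpy(buf, lhs.c_str(), lhs.size());
+    std::memcpy(buf + lhs.size(), rhs.c_str(), rhs.size() + 1);
+
+    MyString result(buf); // construct result from buf
+    delete[] buf;         // release the scratch buffer
+    return result;        // return result
+}
+```
+
+Hold on, there is an issue here. `result` is constructed from a `const char*` (via the first constructor), which is fine in itself. The problem is on the **caller's side**:
+
+```cpp
+MyString s1("ABC");
+MyString s2("DEF");
+MyString s3 = s1 + s2; // expected to yield "ABCDEF"
+```
+
+`s1 + s2` returns a temporary `MyString` (which already owns a heap block holding `"ABCDEF"`). In the pre-move model, and leaving aside copy elision (covered in part three), `s3` is then copy-constructed from it: a fresh block is allocated, the contents are copied across, and the temporary frees its own block when it is destroyed.
+
+What we did was **take data that already existed and was exactly what we wanted, make a duplicate, and then destroy the original**. If that is not waste, what is?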
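The waste is easiest to observe with assignment to an existing object, where copy elision cannot step in. The following is a minimal sketch, not code from the talk: `CountingString`, its counter, and `copies_for_one_assignment` are hypothetical names introduced for illustration, and the class deliberately offers copy operations only (allocating before releasing, unlike the simplified version above).

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the article's MyString: copy operations only,
// plus a counter that makes the deep copy done by assignment observable.
class CountingString
{
    std::size_t len_;
    char* str_;

public:
    static int copy_assign_count;

    CountingString(const char* s)
        : len_(std::strlen(s)), str_(new char[len_ + 1])
    {
        std::memcpy(str_, s, len_ + 1);
    }

    CountingString(const CountingString& o)
        : len_(o.len_), str_(new char[o.len_ + 1])
    {
        std::memcpy(str_, o.str_, len_ + 1);
    }

    CountingString& operator=(const CountingString& o)
    {
        if (this != &o) {
            char* fresh = new char[o.len_ + 1];   // allocate first ...
            std::memcpy(fresh, o.str_, o.len_ + 1);
            delete[] str_;                        // ... then release the old block
            str_ = fresh;
            len_ = o.len_;
            ++copy_assign_count;
        }
        return *this;
    }

    ~CountingString() { delete[] str_; }

    const char* c_str() const { return str_; }
    std::size_t size() const { return len_; }
};

int CountingString::copy_assign_count = 0;

CountingString operator+(const CountingString& a, const CountingString& b)
{
    char* buf = new char[a.size() + b.size() + 1];
    std::memcpy(buf, a.c_str(), a.size());
    std::memcpy(buf + a.size(), b.c_str(), b.size() + 1);
    CountingString result(buf);
    delete[] buf;
    return result;
}

// Assigns a temporary to an existing object and reports how many
// deep-copy assignments that required.
int copies_for_one_assignment()
{
    CountingString::copy_assign_count = 0;
    CountingString s1("ABC");
    CountingString s2("DEF");
    CountingString s3("old");
    s3 = s1 + s2;   // temporary on the right, existing object on the left
    return CountingString::copy_assign_count;
}
```

Calling `copies_for_one_assignment()` from `main` should report one deep-copy assignment: the temporary produced by `s1 + s2` already owns the right buffer, yet `s3` allocates and copies its own before the temporary is destroyed.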
+
+## Let the experiment speak: how expensive is copying
+
+Just saying "waste" is not very tangible. Let's run a simple benchmark comparing string concatenation with and without move semantics.
+
+```cpp
+#include <chrono>
+#include <cstring>
+#include <iostream>
+
+// ===== Version without move =====
+class MyStringNoMove
+{
+    std::size_t len_;
+    char* str_;
+
+public:
+    MyStringNoMove(const char* s)
+        : len_(std::strlen(s))
+        , str_(new char[len_ + 1])
+    {
+        std::memcpy(str_, s, len_ + 1);
+    }
+
+    ~MyStringNoMove() { delete[] str_; }
+
+    MyStringNoMove(const MyStringNoMove& o)
+        : len_(o.len_)
+        , str_(new char[o.len_ + 1])
+    {
+        std::memcpy(str_, o.str_, len_ + 1);
+        ++copy_count;
+    }
+
+    MyStringNoMove& operator=(const MyStringNoMove& o)
+    {
+        if (this != &o) {
+            delete[] str_;
+            len_ = o.len_;
+            str_ = new char[len_ + 1];
+            std::memcpy(str_, o.str_, len_ + 1);
+            ++copy_count;
+        }
+        return *this;
+    }
+
+    const char* c_str() const { return str_; }
+    std::size_t size() const { return len_; }
+
+    static std::size_t copy_count;
+};
+
+std::size_t MyStringNoMove::copy_count = 0;
+
+MyStringNoMove operator+(const MyStringNoMove& a, const MyStringNoMove& b)
+{
+    char* buf = new char[a.size() + b.size() + 1];
+    std::memcpy(buf, a.c_str(), a.size());
+    std::memcpy(buf + a.size(), b.c_str(), b.size() + 1);
+    MyStringNoMove result(buf);
+    delete[] buf;
+    return result;
+}
+
+// ===== Version with move =====
+class MyStringWithMove
+{
+    std::size_t len_;
+    char* str_;
+
+public:
+    MyStringWithMove(const char* s)
+        : len_(std::strlen(s))
+        , str_(new char[len_ + 1])
+    {
+        std::memcpy(str_, s, len_ + 1);
+    }
+
+    ~MyStringWithMove() { delete[] str_; }
+
+    // Copy constructor
+    MyStringWithMove(const MyStringWithMove& o)
+        : len_(o.len_)
+        , str_(new char[o.len_ + 1])
+    {
+        std::memcpy(str_, o.str_, len_ + 1);
+        ++copy_count;
+    }
+
+    // Move constructor!
+    MyStringWithMove(MyStringWithMove&& o) noexcept
+        : len_(o.len_)
+        , str_(o.str_) // steal the pointer directly
+    {
+        o.str_ = nullptr; // keep the source's destructor from delete[]-ing it
+        o.len_ = 0;
+        ++move_count;
+    }
+
+    // Copy assignment: must deep-copy. Never use = default here:
+    // for a class holding a raw pointer, = default shallow-copies the pointer
+    // member by member, and the two objects double-delete on destruction.
+    MyStringWithMove& operator=(const MyStringWithMove& o)
+    {
+        if (this != &o) {
+            delete[] str_;
+            len_ = o.len_;
+            str_ = new char[len_ + 1];
+            std::memcpy(str_, o.str_, len_ + 1);
+            ++copy_count;
+        }
+        return *this;
+    }
+
+    // Move assignment: steal the pointer, null out the source
+    MyStringWithMove& operator=(MyStringWithMove&& o) noexcept
+    {
+        if (this != &o) {
+            delete[] str_;
+            len_ = o.len_;
+            str_ = o.str_;
+            o.str_ = nullptr;
+            o.len_ = 0;
+            ++move_count;
+        }
+        return *this;
+    }
+
+    const char* c_str() const { return str_ ? str_ : "(null)"; }
+    std::size_t size() const { return len_; }
+
+    static std::size_t copy_count;
+    static std::size_t move_count;
+};
+
+std::size_t MyStringWithMove::copy_count = 0;
+std::size_t MyStringWithMove::move_count = 0;
+
+MyStringWithMove operator+(const MyStringWithMove& a, const MyStringWithMove& b)
+{
+    char* buf = new char[a.size() + b.size() + 1];
+    std::memcpy(buf, a.c_str(), a.size());
+    std::memcpy(buf + a.size(), b.c_str(), b.size() + 1);
+    MyStringWithMove result(buf);
+    delete[] buf;
+    return result;
+}
+
+int main()
+{
+    constexpr int N = 100000;
+
+    // Measure the version without move
+    auto t1 = std::chrono::high_resolution_clock::now();
+    {
+        MyStringNoMove a("Hello");
+        for (int i = 0; i < N; ++i) {
+            MyStringNoMove b("World");
+            MyStringNoMove c = a + b;
+            (void)c;
+        }
+    }
+    auto t2 = std::chrono::high_resolution_clock::now();
+
+    // Measure the version with move
+    auto t3 = std::chrono::high_resolution_clock::now();
+    {
+        MyStringWithMove a("Hello");
+        for (int i = 0; i < N; ++i) {
+            MyStringWithMove b("World");
+            MyStringWithMove c = a + b;
+            (void)c;
+        }
+    }
+    auto t4 = std::chrono::high_resolution_clock::now();
+
+    auto ms_nocopy = std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count();
+    auto ms_withmove = 
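std::chrono::duration_cast<std::chrono::milliseconds>(t4 - t3).count();
+
+    std::cout << "=== " << N << " concatenations ===\n";
+    std::cout << "No move semantics: " << ms_nocopy << " ms, "
+              << "copies: " << MyStringNoMove::copy_count << "\n";
+    std::cout << "With move semantics: " << ms_withmove << " ms, "
+              << "copies: " << MyStringWithMove::copy_count
+              << ", moves: " << MyStringWithMove::move_count << "\n";
+    std::cout << "Speedup: " << static_cast<double>(ms_nocopy)
+              / static_cast<double>(ms_withmove) << "x\n";
+
+    return 0;
+}
+```
+
+Build and run:
+
+```bash
+❯ g++ -std=c++20 -O2 -Wall -Wextra bench.cpp -o bench && ./bench
+=== 100000 concatenations ===
+No move semantics: 38 ms, copies: 100000
+With move semantics: 9 ms, copies: 0, moves: 100000
+Speedup: 4.22x
+```
+
+As you can see, with move semantics the copy count is 0 and everything has become a move. Each move just steals one pointer (one pointer assignment plus one nullptr store) instead of allocating fresh memory and copying the contents. Over 100,000 concatenations that is the difference between 38 ms and 9 ms: **more than a 4x speedup**. The gap widens quickly as string length and iteration count grow. One caveat: whether `return result;` inside `operator+` invokes a copy or move constructor at all depends on copy elision. When the compiler applies NRVO (see part three), neither constructor runs, so the counts on your machine may be lower than the ones shown here.
+
+## The intuition behind move semantics: why not just hand it over?
+
+Back to the earlier `s3 = s1 + s2` example. `s1 + s2` produces a temporary that owns a heap block holding `"ABCDEF"`. That temporary is about to be destroyed: its lifetime ends when this statement ends. Since it is about to die anyway, why not hand its memory straight over to `s3`?
+
+That is the core intuition of move semantics: **a temporary is going to be destroyed anyway, so steal its resources before it dies**. Concretely:
+
+1. `s3` takes over the temporary's `actual_str_` pointer directly (one pointer assignment)
+2. The temporary's `actual_str_` is set to `nullptr` (so its destructor does not `delete[]` the block)
+3. When the temporary is destroyed, `delete[] nullptr` does nothing
+
+The whole process involves no `new`, no `memcpy`, and no extra allocation. One pointer assignment plus one nullptr store, and it is done.
+
+## SSO in std::string: why a move is not always needed
+
+At this point you may ask: modern `std::string` has SSO (Small String Optimization), so short strings never touch the heap. Does move semantics still matter for it?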
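+
+Good question. SSO means that if a string is short enough (the libstdc++ threshold is about 15 characters), the data lives inside the object itself and no heap memory is allocated. For such short strings, moving and copying do cost about the same: both just copy a dozen or so bytes.
+
+Once a string exceeds the SSO threshold, however, `std::string` falls back to heap allocation, and the advantage of move semantics shows in full: one pointer swap versus one `malloc` plus `memcpy`. Even for short strings, move semantics lets the compiler skip unnecessary copies in more situations.
+
+For a full analysis of SSO, see the detailed discussion in vol3, [string in depth: SSO, COW, and resize_and_overwrite](../../../vol3-standard-library/02-string-memory-deep-dive.md); we will not expand on it here.
+
+## What we have sorted out so far
+
+Starting from the three deep copies in `swap`, we hand-built the `MyString` class, saw where the cost of copying comes from (heap allocation plus memory copy), and then showed experimentally that move semantics can bring more than a 4x speedup. The core intuition is simple: **a temporary is going to die anyway, so steal its resources before it does**.
+
+But "stealing" needs language-level support. We need a mechanism that distinguishes "this thing will stay around" (lvalue) from "this thing is about to die" (rvalue), so the compiler knows when stealing is safe. That is the subject of the next article: lvalues, rvalues, and the reference system. If the vol2 move semantics series interests you, you can start with [Rvalue references: from copying to moving](../../../vol2-modern-features/ch00-move-semantics/01-rvalue-reference.md), which gives a more systematic treatment.
+
+
+
+
+
+
+
diff --git a/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/02-lvalue-rvalue-and-references.md b/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/02-lvalue-rvalue-and-references.md
new file mode 100644
index 000000000..07f0b0841
--- /dev/null
+++ b/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/02-lvalue-rvalue-and-references.md
@@ -0,0 +1,409 @@
+---
+title: "Lvalues, rvalues, and references: the type-system foundation of move semantics"
+description: "CppCon 2025 talk notes: from K&R's definition of lvalues and rvalues to the C++11 value category system, covering lvalue references, const reference binding rules, and rvalue references"
+conference: cppcon
+conference_year: 2025
+talk_title: "Back to Basics: Move Semantics"
+speaker: "Ben Saks"
+video_bilibili: "https://www.bilibili.com/video/BV1X54y1P7uM"
+video_youtube: "https://www.youtube.com/watch?v=szU5b972F7E"
+tags:
+ - cpp-modern
+ - host
+ - beginner
+difficulty: beginner
+platform: host
+cpp_standard: [11, 17, 20]
+chapter: 4
+order: 2
+---
+
+# Lvalues, rvalues, and references: the type-system foundation of move semantics
+
+:::tip
+This article is an in-depth derivative write-up of Ben Saks's CppCon 2025 talk "Back to Basics: Move Semantics". The link above goes to YouTube; readers in mainland China can watch the Bilibili version directly. The experiments run on Arch Linux under WSL with GCC 16.1.1.
+:::
+
+In the previous article the MyString experiment showed that move semantics cuts a copy's heap allocation plus memcpy down to a single pointer assignment, speeding up 100,000 concatenations by more than 4x. An encouraging result, but it ended on a key open question: how does the compiler know when it is safe to "steal" resources? It needs a language mechanism to tell "this object will still be used" apart from "this object is about to die". That mechanism is lvalues and rvalues.
+
+To be honest, I long had a vague dread of "lvalues and rvalues". The first time I heard the terms, my instinctive reaction was: "Isn't that just the left and right sides of the equals sign?" I soon found out it is not that simple. In `const int x = 10;`, `x` is an lvalue, yet you cannot assign to it; `int&& r = 10;` clearly binds to an rvalue, yet `r` itself is an lvalue. It took me quite a while to fully make sense of these seemingly contradictory facts.
+
+## K&R's original definition: the left and right sides of the equals sign
+
+The terms lvalue and rvalue go back to the birth of C. K&R introduced them in *The C Programming Language*: the "L" in "L value" comes from the assignment expression `E1 = E2`, where whatever stands on the **left** of the assignment must have certain properties. Specifically, `E1` must be an expression that can be located: the compiler must be able to determine where it lives in memory in order to write the value of `E2` into it.
+
+That is the original intuition: **an lvalue is something that can appear on the left of an assignment**.
+
+Take the simplest example:
+
+```cpp
+int n = 1; // OK: n is an lvalue, 1 is an rvalue
+n = 2;     // OK: n is an lvalue and may appear on the left of an assignment
+// 1 = n;  // Error! 1 is an rvalue and cannot appear on the left of an assignment
+```
+
+`n` is a named variable with a definite location in memory; the compiler knows its address, so values can be written into it. The literals `1` and `2` are pure values, and the compiler does not give them a memory address you can write to. You cannot tell the compiler "please write the value of n into the number 1", because the number 1 has no "inside" to write to.
+
+This is the first level of understanding lvalues and rvalues. At this level everything looks fine: an lvalue is something that "has an address and can be assigned to", an rvalue is something that "has no address and cannot be assigned to".
+
+But wait: does this definition not hide an assumption? It assumes that "can appear on the left of an assignment" and "has a memory address" are the same thing. In the earliest days of C that assumption largely held. But C soon introduced `const`, and C++ added references, class types, temporaries, and more. As the language grew more complex, the assumption stopped holding. Next we will see how the crack appeared and why understanding it is essential for move semantics.
+
+## Basic classification: literals and named variables
+
+Before patching those cracks, let's pin down the most basic classification, because these rules have not changed from the C era to today.
+
+**Literals are rvalues.** The integer literal `3`, the floating-point literal `3.14`, the character literal `'a'`, enumeration constants: all are rvalues. They have no memory address (at least not from the programmer's point of view), you cannot assign to them, and they are just the value itself.
+
+**Named variables are lvalues.** `int n;` declares a variable `n` that has a location in memory and can be both read and written. The key point: an lvalue can appear on **either side** of an assignment. In `n = 1`, `n` is on the left (being written); in `m = n`, `n` is on the right (being read). What happens when `n` is on the right? It is read: the compiler fetches the value stored at `n`'s memory location. This "read" has a formal name: the **lvalue-to-rvalue conversion**.
+
+This conversion is almost everywhere; we just do not notice it. Whenever you write `int b = a;`, `a` is an lvalue, but to assign it to `b` the compiler must first read the value stored in `a`, and that step is the lvalue-to-rvalue conversion. Knowing that this conversion exists matters because it explains a subtle fact: **lvalue and rvalue are not two kinds of "things" but two kinds of "properties" of expressions**. The same variable `a` can exhibit lvalue or rvalue properties in different contexts.
+
+## const objects: the first crack in K&R's definition
+
+Now the problem. Consider this code:
+
+```cpp
+const int max = 100;
+// max = 200; // Error! max is const and cannot be assigned to
+printf("&max = %p\n", (void*)&max); // but max has an address!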
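+```
+
+`max` is a const object. You cannot assign to it: `max = 200` is a compile error. By K&R's definition "lvalue = can appear on the left of an assignment", `max` should not be an lvalue. Yet `max` does have a memory address, you can take a pointer to it (`&max` is legal), and you can read its value through that pointer.
+
+This is the crack in K&R's definition: **a const object is an lvalue but is not assignable**. The standard calls these "non-modifiable lvalues".
+
+The distinction matters because it reveals the true core of the lvalue concept: **having an address**, not **being assignable**. A `const int` object has an address but cannot be assigned to; the integer literal `3` has neither an address nor assignability. The former is a non-modifiable lvalue, the latter an rvalue. What separates them is not "can it be assigned to" but "does it have a persistent memory location".
+
+Actual output from GCC 16.1.1 confirms this:
+
+```text
+max = 100
+&max = 0x7ffc47a05dc8
+```
+
+`&max` prints a valid stack address: this const object really does exist in memory.
+
+A side-by-side comparison makes this clearer. `max` in `const int max = 100;` is a non-modifiable lvalue: it has an address, you cannot assign to it, but you can take its address and read it through a pointer. The literal `100` is an rvalue: it has no address, and you cannot assign to it either. What they share is "not assignable"; the key difference is "has a persistent memory location or not". This difference becomes very important once we reach class types and reference binding, because the compiler decides which references may bind to which expressions precisely on that basis.
+
+## Rvalues of class type: member functions can be called
+
+The lvalue/rvalue distinction gets more interesting with class types. Consider a simple struct:
+
+```cpp
+struct Widget
+{
+    int value;
+    void f()
+    {
+        // this points to the address of the object f() was called on
+        printf("Widget::f(), value = %d, this = %p\n", value, (void*)this);
+    }
+};
+```
+
+There are two ways to obtain an rvalue of class type. The first is a function return value: the result of a function that returns `Widget` by value is a class rvalue. The second is a function-style cast: `Widget(7)` converts the integer 7 into a temporary of type `Widget`, which is also a class rvalue.
+
+The interesting part: **you can call member functions on a class rvalue**.
+
+```cpp
+Widget(7).f();       // OK! calls f() on a temporary Widget
+make_widget(42).f(); // OK! calls f() on the temporary returned by the function
+```
+
+This looks odd: are rvalues not supposed to "have no address"? How can you call a member function on something with no address? The answer is that the compiler does something behind the scenes: it allocates a location in memory for the temporary. The standard calls this the **temporary materialization conversion**. The `this` pointer points to that temporarily allocated location.
+
+I ran it on GCC 16.1.1, and the result is interesting:
+
+```text
+Widget::f(), value = 7, this = 0x7ffc9a466b04
+Widget::f(), value = 42, this = 0x7ffc9a466b04
+```
+
+Look closely: `this` is identical in both calls. The most likely explanation is that each temporary dies at the end of its own full-expression, so the compiler is free to reuse the same stack slot for the next one, and thanks to copy elision the value returned by `make_widget` is constructed directly in that slot in the caller's frame. However short-lived these temporaries are, they do occupy real memory while they are alive.
+
+:::warning The version history of temporary materialization: keep two things apart
+Saying "rvalues have no address" is not quite accurate. The accurate statement is that an rvalue does **not need** to have an address; it is not a persistent memory location. But if the compiler temporarily allocates memory for it in order to carry out some operation (calling a member function, binding it to a reference), then at that moment it does "have an address". This implicit allocation by the compiler is temporary materialization.
+
+As for its history, two things need to be separated. The lvalue / xvalue / prvalue **three-way split of value categories** was indeed introduced in C++11. But the **temporary materialization conversion**, as a named standard conversion, was only formally established in **C++17**. It was written into the language rules together with C++17's mandatory copy elision (proposal P0135). The core idea: **a prvalue is not necessarily an object by itself; it is only "materialized" into a temporary object when it is needed as one (for example to call a member function or bind to a reference)**. In the C++11 era this mechanism was still taking shape and had no formal name. Strictly speaking, then, the temporary materialization in `Widget(7).f()` above is standard semantics only from C++17 on, so do not conflate it with C++11's three-way value category split.
+:::
+
+:::warning
+The ability to call member functions on class rvalues is the foundation of move semantics. A move constructor and a move assignment operator are essentially "member functions invoked on a temporary that is about to die". Through rvalue references we gain the right to modify those temporaries.
+:::
+
+## Lvalue references: the first binding rule
+
+Now we enter the world of references. Before C++11 introduced rvalue references, "reference" in C++ meant what we now formally call an "lvalue reference".
+
+"An lvalue reference to T must bind to an lvalue of type T". That sounds convoluted, but the meaning is simple. A reference of type `int&` can only bind to an lvalue of type `int`:
+
+```cpp
+int n = 10;
+int& ri = n;      // OK: ri binds to the lvalue n
+// int& ri2 = 10; // Error! an lvalue reference cannot bind to an rvalue (a literal)
+```
+
+Why is `int& ri = 10` an error? Because `10` is an rvalue with no persistent memory location. A reference needs to know the address of what it refers to, but the rvalue has none, which is a contradiction.
+
+There is one very important exception: **a const lvalue reference can bind to an rvalue**.
+
+```cpp
+const int& cri = 10;    // OK! a const reference can bind to an rvalue
+const int& cri2 = 3.14; // OK! even to a different type (double -> int conversion)
+```
+
+The mechanism behind this: the compiler quietly creates a temporary `int` object to hold the value (or the converted value) and binds the const reference to that temporary. For `const int& cri2 = 3.14;`, the compiler first converts `double` to `int` (3.14 becomes 3), creates a temporary `int` holding 3, and then binds `cri2` to it. That is why the GCC output I saw reads `const lvalue ref to converted: 3`: 3.14 was truncated.
+
+You may ask why it has to be `const`. If a non-const reference could bind to an rvalue, you could modify a temporary through that reference, a temporary that might be destroyed immediately, so the modification would be pointless and an easy source of bugs. With a const reference bound to the temporary you can only read it, not change it, so it is safe.
+
+The rule has an important corollary: **a const reference extends the lifetime of the temporary**. Normally the temporary in `Widget(7).f()` is destroyed when the statement ends. But if a const reference is bound to it, the temporary's lifetime is extended to match that of the reference.
+
+A concrete example shows how much this matters. Suppose you write a function returning `std::string` and receive the result through a const reference:
+
+```cpp
+std::string get_name() { return "hello"; }
+
+const std::string& name = get_name();
+// name is still valid here! the temporary's lifetime has been extended
+printf("%s\n", name.c_str()); // safe
+```
+
+Without the lifetime extension rule for const references, the temporary `std::string` returned by `get_name()` would be destroyed at the end of the statement and `name` would become a dangling reference. Because `const std::string&` is bound to the temporary, the compiler guarantees that the temporary lives at least until `name` goes out of scope.
+
+There is a subtle catch, though: only the "first" reference bound directly to the temporary extends its lifetime; indirect binding through a chain of references does not. For example, in `const std::string& r2 = name;`, `r2` binds to `name` (an lvalue), no temporary is involved, and so no lifetime extension question arises. But whenever a temporary is bound indirectly through several layers, be careful. There is a more detailed discussion in vol2, [Rvalue references: from copying to moving](../../../vol2-modern-features/ch00-move-semantics/01-rvalue-reference.md).
+
+:::warning
+Note: an rvalue reference `T&&` extends the lifetime of a temporary in the same way. `std::string&& r = get_name();` also keeps the returned temporary alive until `r` goes out of scope. This is something rvalue references and const lvalue references have in common: both can bind to a temporary and extend its lifetime. The difference is that an rvalue reference lets you modify the temporary, while a const lvalue reference does not.
+:::
+
+## Rvalue references: born for move semantics
+
+C++11 introduced a new kind of reference, the rvalue reference, written with a double `&&`.
+
+```cpp
+int&& ri = 10;   // OK: an rvalue reference binds to an rvalue (the literal 10)
+// int&& ri2 = n; // Error! an rvalue reference cannot bind to an lvalue
+```
+
+The binding rule for rvalue references is the mirror image of the one for lvalue references: `int&&` can only bind to an rvalue of type `int`. `int&& ri2 = n` is a compile error because `n` is an lvalue.
+
+:::warning
+Even `const int&&` can only bind to rvalues: adding const to an rvalue reference does not suddenly let it bind to lvalues. This is often confused. Const rvalue references are almost never seen in practice and the standard library has hardly any use for them, but they do exist.
+:::
+
+So what are rvalue references good for? The key is this: **through an rvalue reference, we can modify a temporary**.
+
+```cpp
+int&& ri = 10; // the compiler creates a temporary int object for the literal 10
+ri = 20;       // OK! we modified that temporary
+```
+
+For a simple type like `int` this has little practical value. But consider class types: picture a `MyString&&` bound to a temporary `MyString` that owns a dynamically allocated character array. Through the rvalue reference we can "steal" the pointer to that array, set the temporary's pointer to `nullptr`, and let the temporary's destructor do nothing.
+
+This is exactly what the signatures of the move constructor and move assignment operator express: they take their parameter by rvalue reference, telling the compiler "I know this is a temporary, so I can safely steal its resources". That is the topic of the next article, though; first let's complete the picture of the reference system.
+
+You may ask a more fundamental question: why did C++11 introduce a whole new reference type for this instead of reusing lvalue references? A move constructor declared as `MyString(MyString& s)` would not actually be ambiguous with the copy constructor `MyString(const MyString& s)`, since the two differ in constness. The real problem is that an rvalue such as `s1 + s2` cannot bind to a non-const lvalue reference, so that overload would never be selected for temporaries and no "move" could be triggered. Rvalue references fill that gap: they exist specifically to bind to rvalues, and their binding rules do not overlap with those of lvalue references, so overload resolution can automatically distinguish "this is a persistent object (copy it)" from "this is a temporary (steal its resources)".
+
+## The C++11 value category system: lvalue, xvalue, prvalue
+
+So far I have talked about the two categories "lvalue" and "rvalue" as if the world were black and white. In fact, to support move semantics, C++11 extended the value category system from two primary categories to three.
+
+Before C++11, every expression was either an lvalue or an rvalue, as simple as that. C++11 introduced a third category: the **xvalue (expiring value)**. An xvalue says "this object is about to expire, and its resources may be moved from".
+
+The new system works as follows. All expressions are classified along two dimensions, "has identity" (a memory location can be determined) and "can be moved from":
+
+| Category | Has identity | Can be moved from | Examples |
+|------|:--------:|:----------:|------|
+| **lvalue** | yes | no | named variable `n`, `*p`, `++i` |
+| **xvalue** | yes | yes | the result of `std::move(n)` |
+| **prvalue** | no | yes | literal `42`, `Widget(7)`, a temporary returned by a function |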
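+
+There are also two combined notions: **glvalue** (generalized lvalue) = lvalue + xvalue, and **rvalue** = xvalue + prvalue. As a diagram:
+
+```text
+          expression
+          /        \
+     glvalue      rvalue
+     /     \      /    \
+ lvalue     xvalue     prvalue
+```
+
+- **lvalue**: has identity, cannot be moved from. An ordinary named variable.
+- **xvalue**: has identity, can be moved from. The result of `std::move(x)`. It has a name (or rather a definite memory location), but the compiler has been told "you may move its resources away".
+- **prvalue** (pure rvalue): no identity, can be moved from. A pure temporary value, such as a literal or a temporary returned by a function.
+
+This system looks noticeably more complex than the binary split, but its design logic is clear: move semantics needs a way to express "the resources of this thing may be stolen", and the xvalue is that bridge. What `std::move` essentially does is turn an lvalue into an xvalue, telling the compiler "this object still has a name, but you may move its resources away".
+
+### Value categories of common expressions
+
+Definitions alone can feel abstract, so here are the expressions you meet most often in everyday code, each labeled with its category:
+
+| Expression | Value category | Reason |
+|--------|--------|------|
+| `n` (named variable) | lvalue | has a name and a definite memory location |
+| `*p` (dereference) | lvalue | the pointed-to object has a memory location |
+| `++i` (prefix increment) | lvalue | returns the modified `i` itself |
+| `i++` (postfix increment) | prvalue | returns a copy of the old value, which is a temporary |
+| `42` (integer literal) | prvalue | a pure value with no memory location |
+| `"hello"` (string literal) | lvalue | a string literal is an array of const char and has an address |
+| `Widget(7)` (function-style cast) | prvalue | creates a temporary Widget object |
+| `make_widget()` (returns by value) | prvalue | a temporary value returned by a function |
+| `std::move(n)` | xvalue | explicitly turns an lvalue into a "movable" state |
+| `a.m` (member access, a is an lvalue) | lvalue | follows the identity of `a` |
+| `std::move(a).m` (member access, a is an xvalue) | xvalue | follows the xvalue property of `a` |
+
+A few points deserve special attention. The string literal `"hello"` being an lvalue often surprises people: it is actually an array of type `const char[6]` stored in the program's read-only data segment, with a definite address, hence an lvalue. Postfix `++` returns a copy of the old value (a temporary), so it is a prvalue, while prefix `++` returns the modified object itself, so it is an lvalue. The value category of the member access expression `a.m` follows that of `a`: if `a` is an lvalue, `a.m` is an lvalue; if `a` is an xvalue, `a.m` is an xvalue.
+
+## Verifying value categories with the compiler
+
+Enough theory; let's verify it with `decltype` and type traits. `decltype` has a very useful property: applied to a **parenthesized** variable name, `decltype((x))`, it yields a different type depending on the value category of the expression: `T&` for an lvalue, `T&&` for an xvalue, `T` for a prvalue.
+
+```cpp
+#include <cstdio>
+#include <type_traits>
+#include <utility>
+
+template <typename T>
+void print_category()
+{
+    printf(" is lvalue ref: %s\n",
+           std::is_lvalue_reference_v<T> ? "yes" : "no");
+    printf(" is rvalue ref: %s\n",
+           std::is_rvalue_reference_v<T> ? "yes" : "no");
+}
+
+int main()
+{
+    int n = 10;
+
+    printf("decltype((n)):\n");              // n is an lvalue
+    print_category<decltype((n))>();         // int&  -> lvalue ref: yes
+
+    printf("decltype(10):\n");               // 10 is a prvalue
+    print_category<decltype(10)>();          // int   -> neither kind of reference
+
+    printf("decltype(std::move(n)):\n");     // std::move(n) is an xvalue
+    print_category<decltype(std::move(n))>(); // int&& -> rvalue ref: yes
+
+    return 0;
+}
+```
+
+The output from GCC 16.1.1 matches the theory exactly:
+
+```text
+decltype((n)):
+ is lvalue ref: yes
+ is rvalue ref: no
+decltype(10):
+ is lvalue ref: no
+ is rvalue ref: no
+decltype(std::move(n)):
+ is lvalue ref: no
+ is rvalue ref: yes
+```
+
+`decltype((n))` yields `int&` because `(n)` is an lvalue expression. `decltype(10)` yields `int` (the bare type) because `10` is a prvalue. `decltype(std::move(n))` yields `int&&` because the result of `std::move` is an xvalue, and an xvalue shows up in `decltype` as `T&&`.
+
+## "If it has a name, it is an lvalue": the rvalue reference parameter trap
+
+Time to discuss a trap that nearly every C++ newcomer falls into. Ben Saks stresses this rule in the talk: **if something has a name, it is an lvalue**.
+
+Consider a function that takes an rvalue reference:
+
+```cpp
+void process(MyString&& s)
+{
+    // In here, is s an lvalue or an rvalue?
+}
+```
+
+Seen from outside the function, when you call `process(s1 + s2)`, `s1 + s2` is an rvalue, so the call is fine: an rvalue reference can bind to an rvalue. But **inside** the function the parameter `s` has a name. It is a named object. By the "has a name, is an lvalue" rule, **within the function body `s` is treated as an lvalue**.
+
+What does that mean? If you want to move resources out of `s` again inside the function body, you cannot do so directly: the compiler treats `s` as an lvalue and chooses a copy over a move. You must explicitly use `std::move(s)` to tell the compiler "I know what I am doing, please treat it as an rvalue".
+
+```cpp
+void process(MyString&& s)
+{
+    MyString copy(s);             // copy! because s is an lvalue here
+    MyString moved(std::move(s)); // move! std::move turns s into an rvalue
+}
+```
+
+The logic behind this rule is quite reasonable: a function body may span many lines, and `s` may be moved from on the first line and still be used on the tenth. The compiler cannot assume "you only use it on the last line", so it takes the conservative approach: named things are not moved automatically, and you must authorize it explicitly.
+
+:::tip
+The "name = lvalue" rule can be verified with `decltype`. If you write `decltype((s))` in a function template where `s` is declared as `MyString&&`, `decltype((s))` still yields `MyString&` (an lvalue reference), not `MyString&&`. The parenthesized `decltype` looks at the value category of the expression, and `s`, being a named object, has value category lvalue. Interview questions like to set traps with this.
+:::
+
+:::tip
+The "has a name, is an lvalue" rule has one important exception: the **return statement**. In `return s;`, although `s` has a name, a local object or by-value parameter has been treated as "implicitly movable" (an implicitly movable entity) since C++11, and rvalue reference parameters joined that set in C++20, so the compiler can move from it without you writing `std::move(s)`. In practice the compiler may do even better and eliminate the copy altogether through NRVO. The full discussion of this topic is left for the next article.
+:::
+
+## Reference binding rules cheat sheet
+
+Here are all the reference binding rules covered in this article, collected into one table for quick lookup:
+
+| Reference type | Binds to lvalue? | Binds to rvalue? | Binds to a different type? | Can modify the referent? |
+|----------|:-----------------:|:-----------------:|:------------------:|:-----------------:|
+| `T&` | yes | **no** | no | yes |
+| `const T&` | yes | **yes** | yes (via conversion) | no |
+| `T&&` | **no** | yes | yes (via conversion, binds to the converted temporary) | yes |
+| `const T&&` | **no** | yes | yes (via conversion, binds to the converted temporary) | no |
+
+The table packs in a lot, but a few key conclusions are worth remembering. First, `const T&` is the "universal receiver": it can bind to almost anything (lvalues, rvalues, even values of a different type), at the price that you cannot modify the referent through it. Second, `T&&` binds only to rvalues, which is exactly what move semantics needs: it guarantees that whatever it is bound to is an object whose resources can be "safely stolen". Note that, like `const T&`, it accepts an initializer of a convertible type, for example `int&& r = 3.14;`, by binding to the temporary produced by the conversion. Third, `const T&&` exists but is nearly useless: it binds only to rvalues and yet cannot modify them, which throws away the core advantage of rvalue references, namely "being allowed to modify the temporary".
+
+## What we have sorted out so far
+
+In this article we started from K&R's "left of the equals sign" and built up the full picture of C++ value categories step by step. We saw how const objects break the old "lvalue = assignable" definition, how class rvalues obtain a memory location through temporary materialization, how lvalue references and rvalue references follow entirely different binding rules, and finally how the C++11 lvalue/xvalue/prvalue system provides the theoretical basis for move semantics.
+
+There are two core takeaways. First, the rvalue reference `T&&` binds only to rvalues, which gives the compiler a natural signal: "what is bound here is temporary, and its resources can be safely stolen". Second, the "has a name, is an lvalue" rule means that we sometimes need `std::move` to tell the compiler explicitly "moving is allowed here".
+
+In hindsight, the lvalue/rvalue distinction was not invented out of thin air by C++11. It has existed since the C era; it was just much simpler then. C++ added const, class types, references, and operator overloading, and each step blurred the boundaries between value categories a little more, until move semantics required a precise mechanism for telling "persistent" objects from "temporary" ones and C++11 finally formalized the system as the lvalue/xvalue/prvalue classification. Once you understand how this system evolved, concepts such as `std::move`, move constructors, and perfect forwarding come much more easily, because all of them respond to the same question: "How does the compiler know whether this object can be safely moved from?"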
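Both takeaways can be watched in action with a pair of overloads. This is a small sketch under stated assumptions: `which`, `forward_named`, and `forward_moved` are hypothetical probe functions invented for illustration, and `std::string` stands in for `MyString`. Nothing is actually moved here; the functions only report which overload was selected.

```cpp
#include <string>
#include <utility>

// Hypothetical probes: each overload reports which one overload resolution chose.
const char* which(const std::string&) { return "const lvalue ref"; }
const char* which(std::string&&)      { return "rvalue ref"; }

// A named rvalue-reference parameter is itself an lvalue expression,
// so passing it on selects the const std::string& overload.
const char* forward_named(std::string&& s) { return which(s); }

// std::move casts the named parameter back to an rvalue (an xvalue),
// so the std::string&& overload is selected.
const char* forward_moved(std::string&& s) { return which(std::move(s)); }
```

Calling `which(name)` with a named `std::string name` picks the `const std::string&` overload, while `which(std::string("tmp"))` and `which(std::move(name))` pick the `std::string&&` one. The two forwarding helpers differ only in the `std::move`, and that alone flips the result.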
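+
+With this theory in place, the next article moves on to practice: implementing the move constructor and move assignment operator for MyString, seeing how `std::move` actually works, and finding out under which conditions copy elision spares us even the move.
+
+If you want a more systematic treatment of rvalue references, vol2's [Rvalue references: from copying to moving](../../../vol2-modern-features/ch00-move-semantics/01-rvalue-reference.md) is good supplementary material.
+
+
+
+
+
+
+
+
+
diff --git a/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/03-move-ops-stdmove-and-elision.md b/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/03-move-ops-stdmove-and-elision.md
new file mode 100644
index 000000000..fd1b54836
--- /dev/null
+++ b/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/03-move-ops-stdmove-and-elision.md
@@ -0,0 +1,581 @@
+---
+title: "Move operations, std::move, and copy elision"
+description: "CppCon 2025 talk notes: complete implementations of move construction and assignment, what std::move really means, NRVO and C++17 mandatory copy elision, and the moved-from state"
+conference: cppcon
+conference_year: 2025
+talk_title: "Back to Basics: Move Semantics"
+speaker: "Ben Saks"
+video_bilibili: "https://www.bilibili.com/video/BV1X54y1P7uM"
+video_youtube: "https://www.youtube.com/watch?v=szU5b972F7E"
+tags:
+ - cpp-modern
+ - host
+ - beginner
+difficulty: beginner
+platform: host
+cpp_standard: [11, 17, 20]
+chapter: 4
+order: 3
+---
+
+# Move operations, std::move, and copy elision
+
+:::tip
+This is the third article in the notes series on CppCon 2025 "Back to Basics: Move Semantics". The first two covered the cost of copying and the motivation for moving, and lvalues, rvalues, and the reference system. This one focuses on the practical core questions: how to write a move constructor and move assignment, what `std::move` actually does, and how C++17 copy elision changes the game.
+:::
+
+To be honest, I always thought I "understood" move semantics: it is just stealing a pointer, how hard can it be? Then one day in a code review I saw a colleague write `return std::move(result);`. I casually said "nice, an explicit move", and a senior engineer next to me shot that down with one sentence: **"Are you sure writing it that way will not prevent NRVO?"**
+
+It took me an evening to figure out: `return std::move(result)` does not help optimization at all. Instead, it turns a return that the compiler could have completed at zero cost into an extra move construction. From that day on I truly realized that with move semantics the devil is in the details.
+
+In this article we take those details apart one by one. The environment is Arch Linux under WSL with GCC 16.1.1 and `-std=c++20`. If you plan to follow along with the code, have this version or a newer compiler ready.
+
+## Move constructor: the art of stealing a pointer
+
+The previous article left us with a complete set of copy operations for `MyString`. Now we add a move constructor. What this function does is, in Ben Saks's words, a kind of **"destructive copy"**: we "steal" the source object's data and then put the source into a harmless state.
+
+```cpp
+class MyString
+{
+    std::size_t stored_length_;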
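+    char* actual_str_;
+
+public:
+    // ... constructor, destructor, and copy operations from before ...
+
+    // Move constructor
+    MyString(MyString&& s) noexcept
+        : stored_length_(s.stored_length_)
+        , actual_str_(s.actual_str_)
+    {
+        s.actual_str_ = nullptr;
+        s.stored_length_ = 0;
+    }
+};
+```
+
+Let's go through this code line by line, because every line is there for a reason.
+
+First, the parameter type `MyString&& s` is an rvalue reference. An rvalue reference can only bind to rvalues (temporaries, the result of `std::move`, and so on), which means this constructor is called only when the compiler has established that "the source object is about to die". This is the first layer of safety in move semantics: the compiler guards the gate for you through overload resolution.
+
+Next comes the initializer list. `stored_length_(s.stored_length_)` takes over the source's length directly; `std::size_t` is a built-in type, so the "copy" is a single integer assignment at almost zero cost. `actual_str_(s.actual_str_)` is the crucial part: we assign the source's pointer to the new object, which now points to the heap block the source allocated earlier. At this point both objects point to the same memory. If we stopped here, the result would be a double delete, which is undefined behavior.
+
+So the two lines in the function body are the heart of it. `s.actual_str_ = nullptr` nulls out the source's pointer, and `s.stored_length_ = 0` resets its length. When the source's destructor then runs `delete[] actual_str_`, it is really `delete[] nullptr`, and the standard specifies that deleting a null pointer is a safe no-op.
+
+You may have noticed that although the move constructor's parameter `s` is an rvalue reference, the destructor of the object it refers to still runs. Many people overlook this: a move does not mean "after taking over, forget about the source". On the contrary, after the move the source is still a complete, valid object whose internal state we have deliberately set to "harmless" values. It will still be destroyed normally; it just releases nothing when that happens.
+
+## Overload resolution: how does the compiler choose?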
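+
+With both a copy constructor and a move constructor available, how does the compiler choose when it sees an initialization? Overload resolution decides based on the value category of the argument.
+
+```cpp
+MyString s1("hello");
+
+// s1 is an lvalue (it has a name) -> calls the copy constructor
+MyString s2(s1);
+
+// std::move(s1) is an rvalue -> calls the move constructor
+MyString s3(std::move(s1));
+```
+
+In the first initialization, `MyString s2(s1)`, `s1` is an lvalue: it has a name and you can take its address. Seeing an lvalue argument, the compiler looks for a constructor that accepts `const MyString&` and lands on the copy constructor.
+
+In the second, `MyString s3(std::move(s1))`, the expression `std::move(s1)` is an rvalue (an xvalue), so the compiler looks for a constructor that accepts `MyString&&` and lands on the move constructor. That is why both constructors need to coexist: the copy constructor handles "the source will still be used", and the move constructor handles "the source is about to die anyway".
+
+Ben Saks makes a point of this in the talk: **an rvalue reference does not perform a move by itself**. At the type-system level it merely gives the compiler a signal: "this reference is bound to an rvalue". What actually decides between copy and move is overload resolution. If our `MyString` had no move constructor, `std::move(s1)` would just trigger the copy constructor: the compiler falls back to the `const MyString&` version, because an rvalue can also bind to `const MyString&`. No error, but no move either. We will come back to this later.
+
+## Move assignment operator: clean up the old object first
+
+Move construction handles "creating a new object", while move assignment handles "overwriting an existing object". The core logic is similar, but move assignment has one extra step: releasing the target's old resources first.
+
+```cpp
+MyString& operator=(MyString&& s) noexcept
+{
+    if (this != &s) {
+        delete[] actual_str_;            // step 1: release our own old resource
+        stored_length_ = s.stored_length_;
+        actual_str_ = s.actual_str_;     // step 2: steal the source's resource
+        s.actual_str_ = nullptr;         // step 3: null out the source
+        s.stored_length_ = 0;
+    }
+    return *this;
+}
+```
+
+The order matters. We first `delete[] actual_str_` to free our previous heap block and only then take over the source's pointer. Doing it the other way round (assign first, then delete) would delete the very pointer the source just gave us, a textbook use-after-free.
+
+The self-assignment check `if (this != &s)` matters for move assignment too. Although `s` is an rvalue reference and in theory nobody should write `x = std::move(x)`, the language does not forbid it, and template instantiation can occasionally produce that effect. Without the check, `delete[] actual_str_` would free our own memory, and `actual_str_ = s.actual_str_` would then assign a dangling pointer back to ourselves, which blows up immediately.
+
+Note that the return type is `MyString&`, an lvalue reference, not an rvalue reference. The target of an assignment (the object on the left of `=`) is always an lvalue. Whether or not you use `std::move`, the receiving end of an assignment is always "an object with a name and an address".
+
+This implementation is also safe with respect to exceptions: the data members of `MyString` are built-in types only (`std::size_t` and `char*`), and operations on those types do not throw. That is why I marked it `noexcept`. If your class has more complex data members (another `std::string`, say), you need to think about exception safety carefully.
+
+## std::move: the most misunderstood function in C++
+
+The name `std::move` is a real trap. When I first saw it, I naturally assumed it "performs a move", since it is called "move" after all. In fact, **`std::move` itself moves nothing**.
+
+Its true identity is a `static_cast` to an rvalue reference. The standard library implementation is roughly equivalent to:
+
+```cpp
+template <typename T>
+constexpr typename std::remove_reference<T>::type&& move(T&& t) noexcept
+{
+    return 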
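static_cast<typename std::remove_reference<T>::type&&>(t);
+}
+```
+
+Ignoring the `remove_reference` template gymnastics, the core is `static_cast<T&&>(t)`. It converts the argument to an rvalue reference and returns it. That is all. It generates no move code, calls no move constructor, and changes no object's state.
+
+Ben Saks says something refreshingly candid in the talk: **if we could do it over, we would probably call it `make_movable` or `as_rvalue`**. A name like that would at least not suggest that it performs a move.
+
+### Why std::move is needed: the naming trap in swap
+
+If `std::move` does not move, why do we need it? Look at the `swap` function, the scenario that illustrates the problem best.
+
+```cpp
+template <typename T>
+void swap(T& x, T& y)
+{
+    T temp(x); // (1)
+    x = y;     // (2)
+    y = temp;  // (3)
+}
+```
+
+This C++03-style `swap` performs three copies. Naturally we want a move version; the first two articles kept saying that moving is far faster than copying. But here is the problem: inside the function body, `x`, `y`, and `temp` are all lvalues. They all have names, you can take their addresses, and their lifetimes span several statements. The compiler cannot automatically treat them as rvalues: what if you still needed `temp` after the third line?
+
+C++ has a general rule: **if something has a name, it is an lvalue**. Only things without a name (temporaries, literals, results returned by value from a function) can be rvalues. The rule is very reasonable: the compiler has to be conservative and cannot assume `temp` will not be used on the next line.
+
+So we need to tell the compiler explicitly: "I know `temp` will not be used again, please treat it as an rvalue." That is exactly what `std::move` is for:
+
+```cpp
+template <typename T>
+void move_swap(T& x, T& y)
+{
+    T temp(std::move(x)); // move-construct temp
+    x = std::move(y);     // move-assign x
+    y = std::move(temp);  // move-assign y
+}
+```
+
+Each `std::move` passes a message to the compiler: **"At this point I confirm that it is safe to move resources out of this object."** Only with that information does the compiler pick the move version during overload resolution.
+
+### std::move does not guarantee a move
+
+Another easily overlooked trap: `std::move` does not guarantee that a move will happen. If a type has copy operations but no move operations, the result of `std::move` degrades to a copy.
+
+```cpp
+struct CopyOnly
+{
+    CopyOnly() = default;
+    CopyOnly(const CopyOnly&) { std::cout << "copy\n"; }
+    // no move constructor!
+};
+
+CopyOnly a;
+CopyOnly b(std::move(a)); // prints "copy": falls back to the copy constructor
+```
+
+Here `std::move(a)` casts `a` to an rvalue reference, but `CopyOnly` has no constructor taking an rvalue reference. The compiler falls back to the `const CopyOnly&` copy constructor (because an rvalue of type `CopyOnly` can bind to `const CopyOnly&`). There is no error; the "move" you expected simply became a "copy", and silently so.
+
+## The naming paradox of rvalue reference parameters
+
+This is the most confusing point in move semantics, and one Ben Saks spends considerable time on.
+
+When we write a function that takes an rvalue reference parameter, that parameter is **treated as an lvalue** inside the function:
+
+```cpp
+void process(MyString&& s)
+{
+    // s has a name -> s is an lvalue
+    MyString copy(s); // calls the copy constructor! not the move constructor!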
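+    MyString moved(std::move(s));  // this is what calls the move constructor
+}
+```
+
+From outside the function, the argument is an rvalue (for example `process(std::move(x))` or `process(MyString("temp"))`). Once inside the body, however, `s` is a named variable: it exists across multiple statements, and the compiler cannot assume it is used only once. So the "if it has a name, it is an lvalue" rule still applies.
+
+The practical consequence: **inside the function, if you want to move from an rvalue reference parameter, you must use `std::move` explicitly**. And once you have moved from it, the parameter's value in subsequent code is no longer predictable; that is the moved-from state discussed in a later section.
+
+## Implicitly Movable Return Expressions
+
+The good news is that the "if it has a name, it is an lvalue" rule has one important exception: the `return` statement.
+
+```cpp
+MyString make_greeting()
+{
+    MyString temp("hello world");
+    // ... do something with temp ...
+    return temp;  // no std::move needed!
+}
+```
+
+In this code `temp` has a name (so it would normally be an lvalue), but `return temp;` is the last use of `temp` in the function. The compiler knows that `temp`'s lifetime ends as soon as the function returns, so the standard allows it to treat `temp` as an implicitly movable entity.
+
+This means you do **not** need to write `return std::move(temp);`. A plain `return temp;` is enough: the compiler selects the move constructor automatically (or, better still, eliminates the construction altogether, as described next).
+
+## NRVO: An Optimization Better Than Moving
+
+"Implicitly movable" is not the whole story. The compiler can do better than a move: it can deliver the return value to the caller at **zero cost**, without even a move. This is the **Named Return Value Optimization (NRVO)**.
+
+```cpp
+MyString make_greeting()
+{
+    MyString temp("hello world");
+    return temp;
+}
+
+MyString s = make_greeting();
+```
+
+In a world without NRVO (and without any other copy elision), the sequence looks like this: `temp` is constructed in `make_greeting`'s stack frame, a temporary return object is constructed from it (by move or copy), `temp` is destroyed, the temporary is then moved or copied into `s`, and finally the temporary is destroyed. It sounds wasteful because it is.
+
+The idea behind NRVO is neat: when generating code, the compiler constructs `temp` directly in the storage of `s`. It does not construct and then copy; the object is in the right place from the start. `temp` *is* `s`, and they share the same memory. When the function returns, no copy or move is needed because the object is already where it belongs.
+
+Since C++17, elision is **mandatory** in one closely related case: returning a prvalue, as in `return MyString("hello world");`. There the compiler must construct the result directly in the caller's object, and this is defined language behavior rather than an optional optimization. NRVO for a named local such as `temp` is still permitted rather than required, although mainstream compilers perform it routinely. For historical reasons both are still called "optimizations".
+
+For the full technical details of NRVO and RVO, see the dedicated vol2 article: [RVO and NRVO: Compiler Return Value Optimization](../../../vol2-modern-features/ch00-move-semantics/03-rvo-nrvo.md).
+
+## Never Use std::move on Return Values
+
+This is probably the most common move-semantics mistake I have seen. As noted above, `return temp;` is implicitly movable: the compiler either performs NRVO (zero cost) or falls back to the move constructor (the cost of a pointer assignment). Some people then reason: since `std::move` "requests a move", surely `return std::move(temp);` is more explicit and safer?
+
+**It is exactly the opposite.**
+
+```cpp
+// Correct: allows NRVO
+MyString make_good()
+{
+    MyString temp("good");
+    return temp;
+}
+
+// Wrong: prevents NRVO!
+MyString make_bad()
+{
+    MyString temp("bad");
+    return std::move(temp);  // actually slower!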
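+}
+```
+
+The reason lies in the condition for NRVO: the `return` expression must be the name of a local object. When you write `return std::move(temp);`, the returned expression is no longer the name `temp`; it is `std::move(temp)`, a function call expression. The compiler cannot apply NRVO to that expression and has to settle for the move constructor.
+
+In other words, `return std::move(temp);` forces the move-constructor path, whereas `return temp;` gives the compiler the chance to take the NRVO path (zero cost). This is why Ben Saks kept repeating in the talk: **do not use `std::move` on return values**.
+
+We can compare the two with the `-fno-elide-constructors` compiler flag. It turns off GCC's copy elision so that we can see what the world looks like without NRVO.
+
+With elision turned off, `return temp;` falls back to the move constructor, because `temp` is implicitly movable. `return std::move(temp);` is also a move construction, so the two are identical with elision off. With elision on (the default), `return temp;` becomes a no-op while `return std::move(temp);` is still a move construction. That is the gap.
+
+I tested this with GCC 16.1.1. After adding logging to each of `MyString`'s constructors, the comparison looks like this:
+
+```bash
+# NRVO on (default)
+$ g++ -std=c++20 -O2 test.cpp && ./a.out
+=== return temp; (NRVO) ===
+  construct: "hello"        # the only construction: no move, no copy
+
+=== return std::move(temp); ===
+  construct: "hello"
+  move-construct: "hello"   # one extra move construction!
+  destruct: "(null)"
+```
+
+As you can see, `return std::move(temp);` performs one extra move construction. For a class like `MyString`, with just a pointer and an integer, a move is cheap (one pointer assignment), but for more complex classes (an object holding several dynamic containers, say) the extra move is not negligible.
+
+```bash
+# Comparison with NRVO off
+$ g++ -std=c++20 -O2 -fno-elide-constructors test.cpp && ./a.out
+=== return temp; ===
+  construct: "hello"
+  move-construct: "hello"   # no NRVO, falls back to move construction
+  destruct: "(null)"
+
+=== return std::move(temp); ===
+  construct: "hello"
+  move-construct: "hello"   # also a move construction
+  destruct: "(null)"
+```
+
+With NRVO off, the two really do behave the same: one move construction each. That is precisely what shows that `return std::move(temp);` throws away the NRVO opportunity in the default configuration.
+
+:::warning C++20/C++23 further widened what counts as "implicitly movable"
+The rule from this section, "do not use `std::move` on return values", holds in **every standard version (C++11 through C++26)** and is always safe advice. The implicit-move mechanism itself has been strengthened in later standards, which is worth knowing: C++11 introduced the original implicit move (when returning a local object, the compiler may treat it as a move); C++20 (proposal P1825, "More implicit moves") widened the set of implicitly movable entities, for example to local variables bound to rvalue references and to `throw` of a local object; C++23 (proposal P2266) refined this further so that the returned expression is treated as an xvalue in certain scenarios, covering more construction paths.
+
+However these extensions evolve, **the rule "do not write `std::move` when returning a local object" has never changed**. P1825 and P2266 widen what the compiler can move automatically, whereas `std::move` defeats the condition that NRVO depends on. The conclusion is the same as before: write `return temp;` and leave the choice between NRVO and implicit move to the compiler.
+:::
+
+## The Moved-From State: Valid but Unknowable
+
+After a move completes, the source object is in what the standard calls a **"valid but unspecified state"**. Those words are worth unpacking one at a time.
+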
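Before unpacking those words, a minimal sketch with standard library types may help make the idea concrete. The names `archive` and `reuse_after_move` are hypothetical and exist only for this illustration; the sketch assumes nothing about what a moved-from `std::string` contains and relies only on operations that have no preconditions, such as assignment.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the talk): hands a string over to a container.
void archive(std::vector<std::string>& sink, std::string&& s)
{
    // s is a named parameter, so it is an lvalue here and std::move is still required.
    sink.push_back(std::move(s));
}

// Hypothetical demo: moves a string away, then reuses the variable safely.
std::string reuse_after_move(std::vector<std::string>& sink)
{
    std::string name = "a string long enough to need a heap buffer on common implementations";

    archive(sink, std::move(name));

    // name is now moved-from: valid, but its contents are unspecified.
    // Do not read it. Assigning a new value is always fine.
    name = "fresh value";

    return name;  // plain return: no std::move, so NRVO stays possible
}
```

A caller would create a `std::vector<std::string>`, pass it to `reuse_after_move`, and find the original text stored in the vector while the returned string holds `"fresh value"`. The design choice is to overwrite the moved-from variable before any read, which is the only pattern that is portable across standard library implementations.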
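+"Valid" means: no memory leak, no resource leak, no undefined behavior. You can safely let the object be destroyed; its destructor runs normally, with no double free and no crash. For our `MyString`, `actual_str_` is set to `nullptr` and `stored_length_` to 0 after a move, so `delete[] nullptr` in the destructor does nothing.
+
+"Unspecified" means: you cannot assume anything about the value a moved-from object holds. The standard does not say that a moved-from `std::string` is empty, nor that a moved-from `std::vector` is empty. Different standard library implementations may behave differently. Our own `MyString` returns `"(null)"` from `c_str()` after a move (that is our own safety net), but a moved-from `std::string` may return an empty string or may still return the original value. You cannot rely on it.
+
+```cpp
+MyString a("hello");
+MyString b(std::move(a));
+
+// Safe operations:
+// 1. Destruction: always safe
+// 2. Assigning a new value: always safe
+a = MyString("new value");  // OK
+
+// Unsafe assumptions:
+// 1. Assuming a still holds "hello"
+// 2. Assuming a.size() is 0
+// 3. Assuming a.c_str() returns an empty string
+// These may happen to hold on some implementations, but the standard does not guarantee them
+```
+
+:::warning Restrictions on using moved-from objects
+In the Q&A, Ben Saks was asked whether a moved-from object can still be used. His answer was blunt: **after a move, the only things you should do with the source object are assign it a new value or let it be destroyed**. Anything else (reading its value, comparing it, passing it to another function) is gambling: you may win (the implementation happens to give you a predictable value) or lose (the implementation changes, or you switch standard libraries). Do not gamble.
+
+Do not confuse "valid" with "useful": a moved-from object is a legitimate object, but not one with known contents. If you need an empty object, create one explicitly; if you need a specific value, assign it explicitly. Do not expect the move operation to do that for you.
+:::
+
+## Why noexcept Matters: The Hidden Trap in vector Reallocation
+
+Finally, an issue that is often overlooked in real projects but has a large impact: **move constructors should be `noexcept`**.
+
+Why? Consider `std::vector` reallocation. When a `vector` runs out of capacity, it allocates a larger block and transfers the old elements into it. If the element type's move constructor is `noexcept`, `vector` transfers by moving, which is fast. If the move constructor is not `noexcept` and the type is copy-constructible, `vector` falls back to copying.
+
+This is because `vector` provides the strong exception safety guarantee: if an exception is thrown during reallocation, the `vector`'s state must roll back to what it was before. With moves, once an exception is thrown midway, the elements already moved cannot be restored (their resources have been stolen). With copies, the original data is still intact and rollback is safe.
+
+Let us write a small test to verify this behavior:
+
+```cpp
+#include <cstring>
+#include <iostream>
+#include <vector>
+
+class StringNoNoexcept
+{
+    std::size_t len_;
+    char* str_;
+
+public:
+    StringNoNoexcept(const char* s)
+        : len_(std::strlen(s))
+        , str_(new char[len_ + 1])
+    {
+        std::memcpy(str_, s, len_ + 1);
+        std::cout << "  ctor: " << str_ << "\n";
+    }
+
+    ~StringNoNoexcept()
+    {
+        delete[] str_;
+    }
+
+    StringNoNoexcept(const StringNoNoexcept& o)
+        : len_(o.len_)
+        , str_(new char[o.len_ + 1])
+    {
+        std::memcpy(str_, o.str_, len_ + 1);
+        std::cout << "  COPY ctor: " << str_ << "\n";
+    }
+
+    // No noexcept here!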
+    StringNoNoexcept(StringNoNoexcept&& o)
+        : len_(o.len_)
+        , str_(o.str_)
+    {
+        o.str_ = nullptr;
+        o.len_ = 0;
+        std::cout << "  MOVE ctor: " << (str_ ? str_ : "(null)") << "\n";
+    }
+
+    const char* c_str() const { return str_ ? str_ : "(null)"; }
+};
+
+int main()
+{
+    std::vector<StringNoNoexcept> vec;
+    vec.reserve(2);
+
+    std::cout << "=== push 3 elements (triggers reallocation) ===\n";
+    vec.emplace_back("AAA");
+    vec.emplace_back("BBB");
+    vec.emplace_back("CCC");  // reallocation happens here
+
+    std::cout << "\n=== final contents ===\n";
+    for (const auto& s : vec) {
+        std::cout << "  " << s.c_str() << "\n";
+    }
+    return 0;
+}
+```
+
+After compiling and running, you will see output like this (GCC 16.1.1, `-std=c++20 -O2`):
+
+```bash
+$ g++ -std=c++20 -O2 test_noexcept.cpp && ./a.out
+=== push 3 elements (triggers reallocation) ===
+  ctor: AAA
+  ctor: BBB
+  ctor: CCC
+  COPY ctor: AAA    # reallocation copies instead of moving!
+  COPY ctor: BBB
+```
+
+See that? When the third element triggers reallocation, `vector` **copies** the first two elements into the new memory, even though we did implement a move constructor. The reason is that our move constructor is not marked `noexcept`.
+
+Now add `noexcept` to the move constructor:
+
+```cpp
+StringNoNoexcept(StringNoNoexcept&& o) noexcept  // noexcept added
+```
+
+Recompile and run:
+
+```bash
+$ g++ -std=c++20 -O2 test_noexcept.cpp && ./a.out
+=== push 3 elements (triggers reallocation) ===
+  ctor: AAA
+  ctor: BBB
+  ctor: CCC
+  MOVE ctor: AAA    # now it moves!
+  MOVE ctor: BBB
+```
+
+A single `noexcept` keyword decides whether `vector` copies or moves during reallocation. For a class that owns dynamic memory, with large amounts of data, this can mean an order-of-magnitude performance difference.
+
+This is a genuine production-grade trap. Many people write a move constructor but forget `noexcept`, and are then puzzled in performance tests about why move semantics "did not kick in". The answer is often that one keyword.
+
+## The Complete MyString: All Five Members Together
+
+Putting this article and the previous two together gives a complete `MyString` implementation that follows the Rule of Five:
+
+```cpp
+#include <cstddef>
+#include <cstring>
+
+class MyString
+{
+    std::size_t stored_length_;
+    char* actual_str_;
+
+public:
+    // Constructor
+    explicit MyString(const char* s = "")
+        : stored_length_(std::strlen(s))
+        , actual_str_(new char[stored_length_ + 1])
+    {
+        std::memcpy(actual_str_, s, stored_length_ + 1);
+    }
+
+    // Destructor
+    ~MyString()
+    {
+        delete[] actual_str_;
+    }
+
+    // Copy constructor
+    // Note: assumes other is not moved-from (other.actual_str_ != nullptr)
+    MyString(const MyString& other)
+        : stored_length_(other.stored_length_)
+        , actual_str_(new char[other.stored_length_ + 1])
+    {
+        std::memcpy(actual_str_, other.actual_str_, stored_length_ + 1);
+    }
+
+    // Move constructor: noexcept!
+    MyString(MyString&& s) noexcept
+        : stored_length_(s.stored_length_)
+        , actual_str_(s.actual_str_)
+    {
+        s.actual_str_ = nullptr;
+        s.stored_length_ = 0;
+    }
+
+    // Copy assignment operator
+    MyString& operator=(const MyString& other)
+    {
+        if (this != &other) {
+            // Allocate first: if new throws, *this is left untouched
+            char* fresh = new char[other.stored_length_ + 1];
+            std::memcpy(fresh, other.actual_str_, other.stored_length_ + 1);
+            delete[] actual_str_;
+            actual_str_ = fresh;
+            stored_length_ = other.stored_length_;
+        }
+        return *this;
+    }
+
+    // Move assignment operator: noexcept!
+    MyString& operator=(MyString&& s) noexcept
+    {
+        if (this != &s) {
+            delete[] actual_str_;
+            stored_length_ = s.stored_length_;
+            actual_str_ = s.actual_str_;
+            s.actual_str_ = nullptr;
+            s.stored_length_ = 0;
+        }
+        return *this;
+    }
+
+    const char* c_str() const { return actual_str_ ?
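actual_str_ : "(null)"; }
+    std::size_t size() const { return stored_length_; }
+};
+```
+
+All five special member functions are present: destructor, copy constructor, copy assignment, move constructor, and move assignment. This is the Rule of Five: if you need to define any one of them yourself, you most likely need to define all five. The compiler-generated defaults are not safe for a class that owns a raw pointer.
+
+## What We Have Figured Out So Far
+
+Across three articles, we started from the three deep copies in `swap`, went through the value category system of lvalues and rvalues, and in this article took apart the implementation details of the move operations. Here is a concise recap of the key points.
+
+The core of a move constructor is a "destructive copy": steal the source object's resource pointer, then leave the source in a harmless state. Overload resolution chooses between copy and move automatically; you do not need extra logic at the call site. `std::move` moves nothing; it is only a cast to an rvalue reference that lets overload resolution select the move overload. An rvalue reference parameter is an lvalue inside the function, because it has a name, so you still need `std::move` to move from it. The `return` statement is the exception to the "if it has a name, it is an lvalue" rule; the compiler recognizes implicitly movable return expressions on its own. NRVO can deliver the return value to the caller at zero cost, and `return std::move(temp)` prevents NRVO, so never write it. A moved-from object is in a "valid but unspecified" state; the only safe operations are assigning a new value or destroying it. Move constructors should always be marked `noexcept`; otherwise `std::vector` falls back to copying during reallocation, and the performance gap can be very large.
+
+If you want to go deeper into further applications of move semantics (perfect forwarding, universal references, reference collapsing), see vol2's [Perfect Forwarding: Preserving Exact Value Category Propagation](../../../vol2-modern-features/ch00-move-semantics/04-perfect-forwarding.md). Move semantics combined with perfect forwarding form the complete foundation of modern C++ template programming.
+
diff --git a/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/index.md b/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/index.md
new file mode 100644
index 000000000..24ee11240
--- /dev/null
+++ b/documents/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/index.md
@@ -0,0 +1,34 @@
+---
+title: "Back to Basics: Move Semantics"
+description: "CppCon 2025 talk notes: Ben Saks, an introduction to the basics of C++ move semantics"
+conference: cppcon
+conference_year: 2025
+talk_title: "Back to Basics: Move Semantics"
+speaker: "Ben Saks"
+video_bilibili: "https://www.bilibili.com/video/BV1X54y1P7uM"
+video_youtube: "https://www.youtube.com/watch?v=szU5b972F7E"
+tags:
+  - cpp-modern
+  - host
+  - beginner
+difficulty: beginner
+platform: host
+cpp_standard: [11, 17, 20]
+---
+
+## Notes Index
+
+  The Cost of Copying and the Motivation for Moving: From swap to MyString
+  Lvalues, Rvalues, and References: The Type-System Foundation of Move Semantics
+  Move Operations, std::move, and Copy Elision
+
diff --git a/documents/vol10-open-lecture-notes/cppcon/2025/index.md b/documents/vol10-open-lecture-notes/cppcon/2025/index.md
index 8ac3a22f9..a483c8ff3 100644
---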
a/documents/vol10-open-lecture-notes/cppcon/2025/index.md +++ b/documents/vol10-open-lecture-notes/cppcon/2025/index.md @@ -38,3 +38,32 @@ CppCon 2025 的演讲笔记合集。 C++:底层汇编探秘 + +--- + + + + + Back to Basics: C++ Ranges + + +--- + + + + + Back to Basics: Move Semantics + From 5305931123f199406e79ef893dd4b2489908a007 Mon Sep 17 00:00:00 2001 From: Charliechen114514 <725610365@qq.com> Date: Sat, 13 Jun 2026 10:55:39 +0800 Subject: [PATCH 2/3] ci fix: broken links --- .../03-back-to-basics-ranges/01-from-loops-to-iterators.md | 2 +- .../01-copy-cost-and-motivation.md | 4 ++-- .../02-lvalue-rvalue-and-references.md | 4 ++-- .../03-move-ops-stdmove-and-elision.md | 4 ++-- .../03-back-to-basics-ranges/01-from-loops-to-iterators.md | 2 +- .../01-copy-cost-and-motivation.md | 4 ++-- .../02-lvalue-rvalue-and-references.md | 4 ++-- .../03-move-ops-stdmove-and-elision.md | 4 ++-- 8 files changed, 14 insertions(+), 14 deletions(-) diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md index 83b033606..2f9ac7f75 100644 --- a/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md +++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/03-back-to-basics-ranges/01-from-loops-to-iterators.md @@ -358,7 +358,7 @@ Starting from the most primitive index-based `for`, we saw how "traversal" was a The core takeaway is one sentence: **a pair of iterators (one `begin`, one `end`) defines a range, and STL algorithms are built on top of this pair of iterators.** -In the next article, we'll hand this pair of iterators to STL algorithms—seeing how `std::sort`, `std::partition`, and `std::transform` work as "loop replacements," and what hard requirements they have on iterator categories (for example, why `std::sort` can't be used on `std::list`). 
There are also a few classic iterator pitfalls waiting for us there: iterator invalidation, mismatched `begin`/`end`, and reversed argument order. If you want to review container memory layouts first, vol3's [span: A View That Doesn't Own Data](../../../vol3-standard-library/02-span.md) and the container-related articles are excellent prerequisite reading. +In the next article, we'll hand this pair of iterators to STL algorithms—seeing how `std::sort`, `std::partition`, and `std::transform` work as "loop replacements," and what hard requirements they have on iterator categories (for example, why `std::sort` can't be used on `std::list`). There are also a few classic iterator pitfalls waiting for us there: iterator invalidation, mismatched `begin`/`end`, and reversed argument order. If you want to review container memory layouts first, vol3's [span: A View That Doesn't Own Data](../../../../vol3-standard-library/02-span.md) and the container-related articles are excellent prerequisite reading. — the compiler must eliminate the copy, rather than "can eliminate it but doesn't have to." This isn't an optional optimization anymore; it's a defined behavior of the language. For historical reasons it's still called an "optimization," but it's actually a guarantee. -For the complete technical details of NRVO and RVO, we previously had a dedicated article in vol2: [RVO and NRVO: Compiler Return Value Optimization](../../../vol2-modern-features/ch00-move-semantics/03-rvo-nrvo.md). +For the complete technical details of NRVO and RVO, we previously had a dedicated article in vol2: [RVO and NRVO: Compiler Return Value Optimization](../../../../vol2-modern-features/ch00-move-semantics/03-rvo-nrvo.md). 
## Never Use std::move on Return Values @@ -536,7 +536,7 @@ Across three articles, we started from the three deep copies of `swap`, went thr The core of a move constructor is "destructive copy" — steal the source object's resource pointer, then set the source object to a harmless state. Overload resolution automatically selects between copy and move; you don't need to make extra judgments at the call site. `std::move` doesn't move anything; it's simply a cast to an rvalue reference that enables overload resolution to select the move version. An rvalue reference parameter is an lvalue inside a function — because it has a name — so you still need `std::move` to move from it. The `return` statement is an exception to the "if it has a name, it's an lvalue" rule; the compiler automatically identifies implicitly movable return expressions. NRVO can deliver return values to the caller at zero cost — and `return std::move(temp)` prevents NRVO, so never write it that way. A moved-from object is in a "valid but unspecified" state; the only safe operations are assigning a new value or destructing it. Move constructors must be marked `noexcept` — otherwise `std::vector` will fall back to copying during reallocation, and the performance gap can be enormous. -If you want to dive deeper into more application scenarios of move semantics — perfect forwarding, universal references, reference collapsing — check out vol2's [Perfect Forwarding: Preserving Exact Value Category Propagation](../../../vol2-modern-features/ch00-move-semantics/04-perfect-forwarding.md). Move semantics combined with perfect forwarding form the complete foundation of modern C++ template programming. +If you want to dive deeper into more application scenarios of move semantics — perfect forwarding, universal references, reference collapsing — check out vol2's [Perfect Forwarding: Preserving Exact Value Category Propagation](../../../../vol2-modern-features/ch00-move-semantics/04-perfect-forwarding.md). 
Move semantics combined with perfect forwarding form the complete foundation of modern C++ template programming. & v) 核心就一句话:**一对迭代器(一个 `begin`、一个 `end`)定义了一个 range,而 STL 算法就建立在这对迭代器之上。** -下一篇我们就把这对迭代器交给 STL 算法——看 `std::sort`、`std::partition`、`std::transform` 这些「循环的替代品」怎么用,以及它们对迭代器类别有什么硬性要求(比如 `std::sort` 为什么不能用在 `std::list` 上)。那里还有几个迭代器的经典陷阱等着我们:迭代器失效、配错 `begin`/`end`、参数顺序写反。如果你想先复习一下容器本身的内存布局,vol3 的 [span:不拥有数据的视图](../../../vol3-standard-library/02-span.md) 和容器相关文章是很好的前置阅读。 +下一篇我们就把这对迭代器交给 STL 算法——看 `std::sort`、`std::partition`、`std::transform` 这些「循环的替代品」怎么用,以及它们对迭代器类别有什么硬性要求(比如 `std::sort` 为什么不能用在 `std::list` 上)。那里还有几个迭代器的经典陷阱等着我们:迭代器失效、配错 `begin`/`end`、参数顺序写反。如果你想先复习一下容器本身的内存布局,vol3 的 [span:不拥有数据的视图](../../../../vol3-standard-library/02-span.md) 和容器相关文章是很好的前置阅读。 ——编译器必须消除拷贝,而不是"可以消除但也可以不消除"。这不是一个可选的优化,而是语言的定义行为。历史原因让它还叫"优化",但实际上它已经是一种保证了。 -关于 NRVO 和 RVO 的完整技术细节,我们之前在 vol2 有专门的文章讲解:[RVO 与 NRVO:编译器的返回值优化](../../../vol2-modern-features/ch00-move-semantics/03-rvo-nrvo.md)。 +关于 NRVO 和 RVO 的完整技术细节,我们之前在 vol2 有专门的文章讲解:[RVO 与 NRVO:编译器的返回值优化](../../../../vol2-modern-features/ch00-move-semantics/03-rvo-nrvo.md)。 ## 千万别对返回值用 std::move @@ -526,7 +526,7 @@ public: 移动构造函数的核心是"破坏性拷贝"——偷走源对象的资源指针,然后把源对象置成无害状态。重载决议自动选择拷贝还是移动,你不需要在调用点做额外判断。`std::move` 不移动任何东西,它只是一个到右值引用的类型转换,使得重载决议能够选择移动版本。右值引用参数在函数内部是左值——因为它有名字——所以你仍然需要 `std::move` 才能从中移动。`return` 语句是"有名字就是左值"规则的例外,编译器会自动识别隐式可移动的返回表达式。NRVO 可以让返回值零成本到达调用方——而 `return std::move(temp)` 会阻止 NRVO,千万别这么写。移动后的对象处于"有效但未指定"的状态,唯一安全的操作是赋新值或析构。移动构造函数一定要标 `noexcept`——否则 `std::vector` 扩容时会退回拷贝,性能差距可能非常大。 -如果你想继续深入移动语义的更多应用场景——完美转发、万能引用、引用折叠——可以看 vol2 的 [完美转发:保持值类别的精确传递](../../../vol2-modern-features/ch00-move-semantics/04-perfect-forwarding.md)。移动语义和完美转发搭配使用,才是现代 C++ 模板编程的完整基础。 +如果你想继续深入移动语义的更多应用场景——完美转发、万能引用、引用折叠——可以看 vol2 的 [完美转发:保持值类别的精确传递](../../../../vol2-modern-features/ch00-move-semantics/04-perfect-forwarding.md)。移动语义和完美转发搭配使用,才是现代 C++ 模板编程的完整基础。 Date: Sat, 13 Jun 2026 10:58:46 +0800 Subject: [PATCH 3/3] ci fix: broken links --- 
.../01-copy-cost-and-motivation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md index 99ecb2774..0024a9d78 100644 --- a/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md +++ b/documents/en/vol10-open-lecture-notes/cppcon/2025/04-back-to-basics-move-semantics/01-copy-cost-and-motivation.md @@ -415,7 +415,7 @@ Good question. SSO means that if a string is short enough (the threshold in libs But once a string exceeds the SSO threshold, `std::string` falls back to heap allocation, and the advantage of move semantics becomes fully apparent — one pointer swap vs one `malloc` + `memcpy`. Moreover, even for short strings, move semantics allows the compiler to avoid unnecessary copies in more scenarios. -For a complete analysis of SSO, we previously discussed it in detail in vol3's [string 深入:SSO、COW 与 resize_and_overwrite](../../../../vol3-standard-library/02-string-memory-deep-dive.md), so we will not expand on it here. +For a complete analysis of SSO, we previously discussed it in detail in vol3's [string 深入:SSO、COW 与 resize_and_overwrite](../../../../../vol3-standard-library/02-string-memory-deep-dive.md), so we will not expand on it here. ## What We Have Figured Out So Far