Why C++ Is Growing and What C++26 Means for Production Systems
https://www.youtube.com/watch?v=Qvr9MTAU_y4
https://redd.it/1szbl25
@r_cpp
YouTube
Herb Sutter: Why C++ Is Growing and What C++26 Means for Production Systems
C++ is accelerating, and C++26 is built for what developers need now. Hear Herb Sutter on what’s driving its growth.
CPP* Compiler Project *Almost Too Good To Be True
https://github.com/LukeSchoen/CPrime
https://redd.it/1szi16y
@r_cpp
GitHub
GitHub - LukeSchoen/CPrime: Fast C compiler with essential C++ features like constructors, class-templates, overloads, etc
GCC 16.1 released with many new C++26/23 features, C++20 now the default stable language version
https://gcc.gnu.org/gcc-16/changes.html#cxx
https://redd.it/1szveq3
@r_cpp
Some theory and practice of alignment in C++ (guide part 3).
https://pvs-studio.com/en/blog/posts/cpp/1369/
https://redd.it/1szx3m5
@r_cpp
PVS-Studio
Silent foe or quiet ally: Brief guide to alignment in C++. Part 3
We've already covered basic field alignment and explored how inheritance layers data atop one another. By now you might think we have uncovered every trap. But not so fast! This topic has a truly...
Working on a new language (vibe-compiler) – Looking for feedback on my C++23 Lexer/Parser
Hey everyone,
I've been working on a custom programming language called vibe-compiler. It's a low-level project built with C++23.
I want to learn more about how I can build OOP support into my custom language. I'm following www.craftinginterpreters.com. I'd love some feedback on my approach to naming conventions and how I can improve the project. Also, if anyone is interested in contributing or just chatting about compiler design, I'd love to connect!
This is the GitHub repo: https://github.com/gemrey13/vibe-compiler
https://redd.it/1szwqdg
@r_cpp
C++26: string and string_view improvements
https://www.sandordargo.com/blog/2026/04/29/cpp26-string-string_view-improvements
https://redd.it/1t02001
@r_cpp
Sandor Dargo’s Blog
Let’s continue our exploration of C++26 improvements. Today we focus on string_view. Some types got new constructors accepting string_views, and concatenation of strings and string_views just got easier. But let’s start with a brief reminder of what a string_view…
Fast GPU Linear Algebra via Compile Time Expression Fusion
https://arxiv.org/abs/2604.22242
https://redd.it/1t01hj5
@r_cpp
arXiv.org
We describe the Bandicoot GPU linear algebra toolkit, a C++ based library that prioritises ease of use without compromising efficiency. Bandicoot's API is compatible with the popular Armadillo CPU...
Is anyone still using the CTRE library in 2026?
Back in 2019, an engineer and committee member made a splash by introducing CTRE, Compile-Time Regular Expressions.
It was implemented using type aliases instead of constexpr function calls, meaning it was easy to implement in C++11. Very impressive stuff.
It also gave you a nicer call-site spelling in C++20, leveraging generalized non-type template parameter passing.
I'm curious if anyone is still using it. If not, why not? I'd love to hear!
https://redd.it/1t05xxt
@r_cpp
I made a super fast CNN library for C++20 from scratch.
I was exploring Convolutional Neural Networks (CNNs) in more depth and had an interesting idea: make a dependency-free, header-only CNN library for C++20.
I did some research and found tiny-dnn, a CNN library for C++14 that is super fast, but its developers stopped updating it back in 2016. So I decided to take on the challenge of making my own CNN library from scratch for C++20, with extreme performance tuning for the CPU, and I got close to what I was expecting.
I benchmarked against PyTorch and the results were good enough to post; I have documented the library, along with the benchmark results, here. In some instances it outperformed PyTorch, which shocked me too.
Documentation: "https://Inkd.in/gNFF74JJ"
To give a rough idea of speed: it reaches 97.51% accuracy on the MNIST dataset in just 25 seconds of training, with a throughput of 2k+ images/second.
Processor: Ryzen 7 5800H (mobile)
Overview:
The engine uses a DAG layout
Zero allocation
Multithreading support
L1/L2 cache optimization
and a lot more going on internally. Here is the repository link:
https://github.com/KunwarPrabhat/CustomCNN
The engine is still in its early stages, so there are a lot of things that can be fixed. I need more developers to contribute if they're interested :))
https://redd.it/1t080e4
@r_cpp
Syntactic sugar of member function binding?
Instead of
std::bind_front(&Very::Long::Namespace::VeryLongClassName::method, pobject)
I'm currently using
#define BIND_FRONT(method, pobject) std::bind_front(&std::remove_cvref_t<decltype(*(pobject))>::method, (pobject))
BIND_FRONT(method, pobject)
I wonder if we can make
(pobject->method)
or
((*pobject).method)
or
((*pobject)::method)
a syntactic sugar of
BIND_FRONT(method, pobject) // a.k.a `std::bind_front(&std::remove_cvref_t<decltype(*(pobject))>::method, (pobject))`
Possible? Any drawbacks?
https://redd.it/1t0gryo
@r_cpp
Sub-microsecond timing on EC2 is way messier than I expected
Been doing sub-microsecond profiling on EC2 and kept getting wildly inconsistent cycle counts.
One mistake was using cpuid as the serialization barrier before rdtsc. On a VM that can be a mess, since cpuid often traps so the hypervisor can fake feature flags. So now the "measurement overhead" includes a VM exit, which is thousands of cycles on some runs.
Switching to lfence + rdtsc made the numbers a lot more stable.
Then I hit the calibration problem. Measuring TSC frequency with a short sleep() looked simple, but the results were all over the place. Scheduler delay, timer granularity, and probably vCPU steal time were enough to make the calibration useless at this scale. A busy-wait loop with pause gave me a much saner number.
Also forgot to pin the thread at first. rdtscp at least tells you when you migrated, but those samples are basically trash. Same with the first few iterations before icache/branch predictor warm up.
Curious what people here actually use for sub-microsecond timing. Do you just trust nanobench / Google Benchmark, or do you still end up writing your own rdtsc wrappers once VMs get involved?
https://redd.it/1t0o16n
@r_cpp
Juan Alday of Citadel Securities: Why C++ Wins in Finance (April 28th, 2026)
https://www.youtube.com/watch?v=InLxLEqg_fs
https://redd.it/1t0tnv4
@r_cpp
YouTube
Juan Alday: Why C++ Wins in Finance
In markets, performance is defined by consistency under pressure. Hear Juan Alday share why C++ continues to deliver.
CSC4700: Parallel C++ for Scientific Applications
https://www.youtube.com/playlist?list=PL7vEgTL3Falab59uJoOb7AtFQKVuL0MV-
https://redd.it/1t0viu1
@r_cpp
External Polymorphism in C++26
While developing our Type Erasure library Any++ for C++23, we had to resort to a preprocessor-based EDSL to eliminate the boilerplate.
By studying the "C++26 Reflection Proposals," I was constantly searching for a way to replace this preprocessor programming.
Implementing Type Erasure requires three components:
- A "V-Table" for indirecting function calls.
- A "Facade" for the ergonomic connection between the data and the "V-Table."
- An "Adapter" to connect the functions of the "V-Table" to the specific type.
It suffices to describe one of these components. The other two can then be generated automatically.
One way to do this with C++26 is to specify in code the default adapter and generate the V-table and facade.
C++26 allows you to generate a class using define_aggregate. The key is that the class can only contain data members.
However, since a data member can have an operator(), member functions can also be simulated this way. These data members can be specified so that they don't occupy any memory. This allows you to access the enclosing class in the operator() and use the information it manages (the V-table and the data reference).
Interestingly, this method also enables static dispatch with an elegant interface.
To coincide with the GCC 16 release and its excellent reflection implementation, I've sketched out such an API:
template <typename Self>
struct stringable {
    [[=default_{}]] // that says: when not specialized, call self.as_string()
    static std::string as_string(Self const& self);
};

void print(std::vector<dyn<stringable>> const& things) {
    for (auto& thing : things) {
        std::println("{}", thing.as_string());
    }
}

template <>
struct stringable<int> {
    static std::string as_string(int const& self) {
        return std::to_string(self);
    }
};

template <>
struct stringable<std::string> {
    static std::string as_string(std::string const& self) {
        return self;
    }
};

struct foo { double f; };

template <>
struct stringable<foo> {
    static std::string as_string(foo const& self) {
        return "foo: " + std::to_string(self.f);
    }
};

struct boo {
    bool b = false;
    std::string as_string() {
        return std::string{"boo? "} + (b ? "T" : "F");
    }
};

int main(int argc, char* argv[]) {
    // static dispatch
    auto a1 = trait_as<int, stringable>{{42}};
    auto z_from_self = a1.as_string();
    std::println("z_from_trait = {}", z_from_self);

    // dynamic dispatch, reference semantics only
    int i = 4711;
    auto dyn_stringable = dyn<stringable>{i};
    auto z_from_dyn_stringable = dyn_stringable.as_string();
    std::println("z_from_dyn_stringable = {}", z_from_dyn_stringable);

    std::string s = "hello world";
    foo a_foo{3.14};
    boo a_boo{true};
    print({dyn<stringable>{i}, dyn<stringable>{s}, dyn<stringable>{a_foo}, dyn<stringable>{a_boo}});
}
Compiler Explorer
This is, of course, just a rough outline. The real value of a type erasure library lies in providing additional runtime capabilities (downcast, crosscast), lifetime mechanisms (shared, unique, value, etc.), and const correctness.
Any++ provides all of this. As soon as Clang and MSVC also offer reflection, I will implement the presented technique there.
For now, I have to thank the wizards who created this technical marvel: "C++ compile-time reflection"!
https://redd.it/1t14dw4
@r_cpp
GitHub
GitHub - bitfactory-software/anyxx: C++ vocabulary for programming on a large scale
Microsoft ODBC Driver 17.11.1 for SQL Server released
ODBC Driver 17.11.1 is out.
Fixes:
Parameter array processing: SQL_ATTR_PARAMS_PROCESSED_PTR now reports correctly; row counting fixed when SQL_PARAM_IGNORE is used
Connection error with Data Classification metadata in async mode
XA recovery transaction ID computation
RPM side-by-side installs now work
Debian package license acceptance
New platforms:
macOS 14, 15, 26
Debian 13
RHEL 10
Oracle Linux 9, 10
SUSE 16
Ubuntu 24.04, 25.10
Alpine 3.21, 3.22, 3.23
Download: https://learn.microsoft.com/en-us/sql/connect/odbc/download-odbc-driver-for-sql-server
Full blog post: Microsoft ODBC Driver 17.11.1 for SQL Server Released | Microsoft Community Hub
https://redd.it/1t15ffy
@r_cpp
The STL for Geometry: Thirty-Year Evolution of C++ Libraries
https://polydera.com/algorithms/the-stl-for-geometry
https://redd.it/1t19d0y
@r_cpp
Polydera
VTK, CGAL, libigl, MeshLib, and what comes next. What the STL philosophy of separating algorithms from data looks like when applied to geometry: semantic ranges, composable policies, and parallel algorithms that operate on them.
[Showcase/Request for Feedback] Achieving 0.31ns Pathfinding on M1 for Search & Rescue Drones – Seeking advice on further optimization.
Hi everyone,
I’m a student and student pilot from Vietnam, currently obsessed with combining Physics and C++ to solve real-world problems. My current project, H.A.L.O. Aegis, is a 600-700KB core designed for search-and-rescue drones operating in catastrophic environments (like collapsed buildings).
My goal was to create a "zero-latency" escape route identifier that can fit into the tiny L2 cache of embedded systems.
Current Specs:
Performance: ~0.326 ns per op on Apple M1 (measured via Google Benchmark).
Throughput: 3.0679 G ops/s.
Memory Safety: Verified with AddressSanitizer (ASan).
The "Elephant in the room": Since I wanted to move fast on the rescue logic, I used AI to help generate some of the boilerplate and the bilingual interface (about 30-40% of the code). I manually hand-tuned the core physics-based logic to hit the sub-nanosecond mark.
Why I'm here: I’m planning to share this with NGOs like the Red Cross, but before I do, I want to make sure the code is truly "bulletproof."
Is my benchmarking methodology sound?
Are there any C++20 features I missed that could make this even more efficient for ARM64?
Please be kind—I'm still learning and I'm aware some of my internal comments might be messy (working on English-izing them!).
I'm ready for the "code review of a lifetime." If there’s anything not quite right, please let me know so I can fix it before it actually goes into a drone to save lives.
Project Link: https://github.com/Nguyenidkskibidi/halo-aegis-core
Thank you for your time and expertise!
https://redd.it/1t1h0do
@r_cpp
GitHub
GitHub - Nguyenidkskibidi/halo-aegis-core: Sub-nanosecond 10-layer SIMD bitboard and JPS+ navigation engine for autonomous robotics.…