After reading a fantastic article by Ian Whitney, I realized that there is some confusion regarding the “length” of a string in Rust. According to the documentation, std::string::String::len() returns the number of bytes in the given string. On a technical level, there is nothing confusing about this definition. However, programmers coming from other languages (like Java and Ruby) generally expect the “length” of a string to be the number of characters within it.
The problem with this difference in definition is brought to light in a playpen by respeccing, which shows that Rust’s std::string::String::len() function produces counter-intuitive results: two strings with the same character count return different “lengths” because their UTF-8 encodings contain a different number of bytes.
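To make the mismatch concrete, here is a minimal sketch of the kind of comparison described above. It is not the code from the playpen; the constants ASCII_WORD and ACCENTED_WORD and the helper byte_len are hypothetical names chosen for illustration.

```rust
// Two five-character strings (illustrative values, not taken from the playpen).
// The second contains U+00E9 (é), which occupies two bytes in UTF-8.
const ASCII_WORD: &str = "hello";
const ACCENTED_WORD: &str = "h\u{e9}llo";

/// Hypothetical helper: returns the length in bytes,
/// which is exactly what `len()` reports.
fn byte_len(s: &str) -> usize {
    s.len()
}
```

Under these assumptions, byte_len(ASCII_WORD) is 5 while byte_len(ACCENTED_WORD) is 6, even though both strings read as five characters.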
The solution is to use the string’s character iterator and count its elements instead, by calling chars().count() on the string. Note that this counts Rust chars (Unicode scalar values), so it matches the intuitive character count for most text.
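The following is a small sketch of that approach, assuming a hypothetical helper named char_len. It only wraps the standard chars() iterator and its count() method.

```rust
/// Hypothetical helper: counts the `char`s in a string by walking
/// its character iterator. This is O(n) in the length of the string,
/// whereas `len()` simply reads the stored byte length.
fn char_len(s: &str) -> usize {
    s.chars().count()
}
```

With this helper, char_len("hello") and char_len("h\u{e9}llo") both return 5, although len() reports 5 and 6 bytes respectively. One caveat: a character written as a base letter plus a combining mark, such as "e\u{301}", still counts as two chars.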