Tokenize a String

This function is one of the most useful functions I have ever written. It accepts a string of anything, including spaces and tabs, and returns a vector that contains the individual words in that string. You can include as many spaces or tabs as you like before the string, after it, and even between any two words in it.

This function will ignore all the spaces and retrieve individual words. This function is useful in many ways. For example, you may want to keep track of information in the form of a string of data. Then you can use this function to retrieve individual items.


Another example: you may expect a user to enter many data items with a space as the delimiter. Humans are error-prone, though, and may enter more than one space between two data items, or even tabs. In those situations, using this function to process the data is a good idea. By the way, this function is already covered in Chapter 12.9.
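To make the messy-input case concrete, here is a minimal sketch that uses only the standard library. It is not the function from this page: splitWords is a hypothetical helper that relies on stream extraction (operator>>), which skips runs of spaces and tabs on its own, so it should return the same items for ordinary whitespace-separated text.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (an assumption for illustration, not this page's
// function): split a line on whitespace using stream extraction.
std::vector<std::string> splitWords(const std::string& line) {
    std::istringstream in(line);      // treat the line as an input stream
    std::vector<std::string> words;
    std::string word;
    // operator>> skips leading whitespace, then reads up to the next whitespace
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}
```

For example, splitWords("  12 \t 7   30\t") yields the three items 12, 7 and 30, however many spaces or tabs the user typed between them.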
#include <cctype>
#include <string>
#include <vector>
using namespace std;

/*
precondition: s contains tokens separated by unprintable characters
              (spaces, tabs, newlines and so on)
postcondition: returns a vector that contains all tokens of s
*/
vector<string> tokenize(string s){
    vector<string> vs;
    string::size_type slen=s.length();
    string::size_type i,j=0;
    /* use a while loop to retrieve all tokens */
    while(j<slen){
        /* skip all unprintable characters until a printable character is reached */
        while(j<slen && !isgraph(static_cast<unsigned char>(s[j])))
            j++;
        /* nothing but separators was left */
        if(j>=slen)
            break;
        /* remember where the token starts */
        i=j;
        /* take all characters until an unprintable character is reached */
        while(j<slen && isgraph(static_cast<unsigned char>(s[j])))
            j++;
        /* store it in the vector */
        vs.push_back(s.substr(i,j-i));
    }
    /* return the vector */
    return vs;
}
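A tokenizer of this kind is easy to get wrong at the boundaries: an empty string, a string of separators only, a one-character token at the very end, and trailing spaces. The sketch below checks those cases. It repeats a compact tokenizer with the same interface so that it compiles on its own. The names tokenizeCompact and checkEdgeCases are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Compact tokenizer (hypothetical name) repeated here so the checks
// below compile on their own.
std::vector<std::string> tokenizeCompact(const std::string& s) {
    std::vector<std::string> vs;
    std::string::size_type j = 0;
    while (j < s.length()) {
        // skip characters that are not graphic (spaces, tabs, newlines)
        while (j < s.length() && !std::isgraph(static_cast<unsigned char>(s[j]))) j++;
        std::string::size_type i = j;  // start of the next token, if any
        // advance to the end of the token
        while (j < s.length() && std::isgraph(static_cast<unsigned char>(s[j]))) j++;
        if (j > i) vs.push_back(s.substr(i, j - i));
    }
    return vs;
}

// Hypothetical check routine covering the boundary cases.
void checkEdgeCases() {
    assert(tokenizeCompact("").empty());        // empty input
    assert(tokenizeCompact(" \t\n").empty());   // separators only
    std::vector<std::string> t = tokenizeCompact("ab c");
    assert(t.size() == 2 && t[0] == "ab" && t[1] == "c");  // one-character last token
    t = tokenizeCompact("\t ab  ");
    assert(t.size() == 1 && t[0] == "ab");      // leading and trailing separators
}
```

Calling checkEdgeCases() from main should complete silently. A failed assertion points at the boundary case that needs attention.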