Monday, May 7, 2012

HTML to text file converter

Reason: Downloaded HTML novels still show ads when you read them in a web browser. With a plain text file, you can read them comfortably on an Android device.

1. Download HTML files by using:

wget -r -l1 your_internet_folder

2. Generate one big file:

cat /your/path/to/html/*.html > t.txt
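
One thing to watch with the cat command above: the shell expands *.html in alphabetical order, so 10.html comes before 2.html and numbered chapters can end up out of sequence. A minimal sketch of one way around this, assuming GNU sort (for the -V version-sort option) and file names without newlines; the function name concat_in_order is made up for illustration:

```shell
# concat_in_order DIR OUTPUT
# Concatenate DIR/*.html into OUTPUT in natural numeric order
# (1.html, 2.html, ..., 10.html) instead of plain alphabetical order.
# Assumes GNU sort (-V) and that no file name contains a newline.
concat_in_order() {
    printf '%s\n' "$1"/*.html | sort -V | while IFS= read -r f; do
        cat -- "$f"
    done > "$2"
}
```

Usage would look like: concat_in_order /your/path/to/html t.txt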

3. The following commands only work if every content line starts with special_string and ends with control_string:

grep "special_string" t.txt > tt.txt
sed 's/special_string//g' tt.txt > ttt.txt
sed 's/control_string//g' ttt.txt > your_file_name.txt
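
The three commands above can also be combined into one pipeline with no intermediate files. This is only a sketch under some assumptions: the marker values below are hypothetical examples, not taken from any real site, and the markers must not contain the | character or regex special characters such as . * [ ] \ because sed treats its pattern as a regular expression.

```shell
# Hypothetical markers; replace them with the strings that wrap
# the content lines in your own HTML files.
start_marker='<p class="text">'
end_marker='</p>'

# extract_text INPUT OUTPUT
# Keep only the lines containing start_marker, then delete both markers.
# grep -F matches the marker as a fixed string; sed uses | as the
# delimiter so that the / in a closing tag needs no escaping.
extract_text() {
    grep -F -- "$start_marker" "$1" \
        | sed -e "s|$start_marker||g" -e "s|$end_marker||g" > "$2"
}
```

Usage would look like: extract_text t.txt your_file_name.txt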