Tuesday, February 24, 2009 

Using Perl and Regular Expressions to Process HTML Files - Part 1

Like many web content authors, I've often needed over the past few years to clean up a batch of HTML files generated by a word processor or publishing package. Initially, I cleaned up the files manually, opening each one in turn and making the same set of updates to each. This works fine when you only have a few files to fix, but when you have hundreds or even thousands, you could very quickly be looking at weeks or even months of work. A few years ago someone put me on to the idea of using Perl and regular expressions to perform this 'cleaning up' process.

Why write an article about Perl and regular expressions, I hear you say? Well, that's a good point. After all, the web is full of tutorials on Perl and regular expressions. When I was trying to find out how to process HTML files, though, I found it difficult to find tutorials that met my criteria. I'm not saying they don't exist; I just couldn't find them. Sure, I could find tutorials that explained everything I needed to know about regular expressions, and I could find plenty of tutorials about how to program in Perl, and even how to use regular expressions within Perl scripts. What I couldn't find was a tutorial that explained how to open one or more HTML or text files, make updates to those files using regular expressions, and then save and close the files.

The Goal

When converting documents into HTML the goal is always to achieve a seamless conversion from the source document (for example, a word processor document) to HTML. The last thing you need is for your content authors to be spending hours, or even days, fixing untidy HTML code after it has been converted.

Many applications offer excellent tools for converting documents to HTML and, in combination with a well designed cascading style sheet (CSS), can often produce perfect results. Sometimes, though, there are little bits of HTML code that are a bit messy, usually caused by authors not applying paragraph tags or styles correctly in the source document.

Why Perl?

Perl is such a good language for this task because it is excellent at processing text files, which, let's face it, is all HTML files are. Perl is also the de facto standard for the use of regular expressions, which you can use to search for, and replace or change, bits of text or code in a file.

What is Perl?

Perl (Practical Extraction and Report Language) is a general purpose programming language, which means it can be used to do anything that any other programming language can do. Having said that, Perl is very good at doing certain things, and not so good at others. Although you could do it, you wouldn't normally develop a user interface in Perl, as it would be much easier to use a language like Visual Basic for that. What Perl is really good at is processing text. This makes it a great choice for manipulating HTML files.

What is a Regular Expression?

A regular expression is a string that describes or matches a set of strings, according to certain syntax rules. Regular expressions are not unique to Perl - many languages, including JavaScript and PHP, can use them - but Perl handles them better than any other language.
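To make that definition a little more concrete, here is a minimal sketch of two regular expression substitutions applied to a string of HTML. The sample markup, the variable names, and the two clean-up rules are made up for illustration; they are assumptions about the kind of mess a word processor might leave behind, not a recipe for any particular tool's output.

```perl
use strict;
use warnings;

# Hypothetical sample: the sort of markup a word processor export might contain.
my $html = '<p class="MsoNormal">Hello</p><p class="MsoNormal"></p><p>World</p>';

# Copy the string, then remove paragraphs that hold nothing but whitespace.
# <p\b[^>]*> matches an opening <p> tag with any attributes,
# \s* matches optional whitespace, and </p> matches the closing tag.
(my $clean = $html) =~ s{<p\b[^>]*>\s*</p>}{}gi;

# Strip the class attribute from the <p> tags that are left.
$clean =~ s{<p\s+class="[^"]*">}{<p>}gi;

print "$clean\n";   # prints: <p>Hello</p><p>World</p>
```

Each pattern describes a whole set of strings (any empty paragraph, any paragraph tag with a class attribute) rather than one literal piece of text, which is what makes the same two lines useful across hundreds of files.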

In part 2, we'll look at our first example Perl script.

About the Author: John Dixon is a web developer working through his own company, John Dixon Technology Limited (dixondevelopment.co.uk). The company also develops and supplies a free accounting and bookkeeping software tool called Earnings Tracker (dixondevelopment.co.uk/earningstracker.htm). The company's web site contains various articles, tutorials, news feeds, and a finance and business blog.

 

Laptop Repair and Upgrade

Laptop repair isn't always easy, but sometimes it can be. If the laptop isn't under warranty and you don't feel comfortable replacing some of these parts yourself, you'll have to find a professional to do it. If you only want to upgrade, you can't send it back to the manufacturer, so you'll have to find a tech to do it. You can bring it to a well known retailer like Best Buy and have the Geek Squad work on it, since they're pretty well trained and always do a good job on repairs, or you can find a professional someplace else to do it for you.

Easy to replace parts:

Wireless adapter: If your wireless stops working for some reason, you can just go out and get a USB wireless adapter or a PCMCIA wireless adapter to put in the card slot.

RAM/Memory: Memory is pretty easy to install and fairly inexpensive. In many cases, you can drastically improve the performance of your laptop just by adding some more memory. Most places that sell laptops can install it for you, or if you feel comfortable installing it yourself you can put it in. There's an enclosure on the bottom that you unscrew in order to install the memory.

Hard Drive: A hard drive can be a little more tricky to upgrade. If your hard drive gets corrupted, you'll probably lose most of your data unless a professional can recover it. Then you'd just have to toss it out, buy a new one and have it installed. The #1 reason for hard drive corruption is abuse, so be careful with your laptop if you wanna keep your hard drive safe. If you just want to upgrade to have more space, first you'd need to back up your data, then take the old drive out and put the new one in. A hard drive upgrade is similar to a RAM upgrade. There is an enclosure, usually on the bottom, that contains the hard drive. In Gateway laptops it's usually in the front for easy access.

Not so easy to replace parts:

For these repairs you'll definitely have to take it to a professional, or sometimes it'll actually be more worthwhile and less hassle just to junk it and buy a new one.

CPU: If you want to upgrade the CPU in your laptop, you're going to have a hard time because usually the motherboard has to be changed as well. You'd be better off just buying a new laptop. On the other hand, if your processor goes bad for some strange reason and you want to install the same one or even a higher speed version of the same processor, it is doable. You'll probably have to ship it out or find a professional to do this.

Motherboard: Upgrading or replacing the motherboard can be a hassle as well, and costly. Again, you'd probably be better off just buying a new laptop unless you have a good warranty and you're willing to go through the hassle of shipping it out.

Keyboard Keys: When a key pops off the keyboard, it usually can't be put back on, so if your laptop is under warranty you'll need to send it out for repair. Otherwise you can take it to a tech and have them order the keys you need.

Graphics card: Since laptops don't have as much room as a desktop computer, the graphics card is usually integrated onto the motherboard. If you really think it's worth it to spend the money and replace the guts of your laptop, go right ahead. Otherwise I'd say buy a new laptop with a very good graphics card so you'll never feel the need to upgrade.

Various other upgrades and add-ons can be done through USB or the PCMCIA card slot. For example, if you get more involved with things like sound and video, there are certain expansion cards you can buy depending on what you're trying to accomplish. They're usually plug and play and very easy to install.

I work in retail, and my website is a buyer's guide for laptops for people who don't really know which direction to go in. I'm 22 now and computers have been my hobby since I was twelve. If you wanna check out my website or e-mail me with suggestions, visit laptop-buyers-guide. My email is Tarbash_200@yahoo
