Introduction to Perl
Matt Hudson
(with thanks to Stuart Brown of NYU, for some great examples and teaching ideas)

Review
• blastall: do a BLAST search
• HMMER
  – hmmpfam: search against an HMM database
  – hmmsearch: search proteins with an HMM
  – hmmbuild: make an HMM from a protein alignment, as made by clustalw
• clustalw: align protein or DNA sequences
• fasta34: search a sequence using an older, slower, but sometimes more flexible algorithm

grep – my favorite
• Allows you to pick out lines of a text file that match a query, count them, and retrieve lines around the match.
  grep 'Query=' myblast.txt          What sequences did I BLAST?
  grep -c '>' testprotein.txt        How many sequences are in this file?
  grep -A 10 '>' testprotein.txt     Give me the first ten lines of each protein

ftp commands
• ftp ftp.ncbi.nih.gov    go to the NCBI site
• open     open a connection
• ls       same as UNIX
• cd       same as UNIX
• get      get me this file
• mget     get more than one file
• put      put a file on the server
• lcd      local cd
• !        local shell
• close    close connection
• bye      exit the ftp program

Test time
• OK. You are now up and running with UNIX, and can use it to do some fairly sophisticated bioinformatics.
• We're going to concentrate on Perl scripting from now on.

UNIX books
• You might find that your UNIX skills need some refreshing from time to time. I recommend having one of these books around in case you need some help using the command line:
• For students who haven't done much UNIX:
  Sams Teach Yourself Unix in 24 Hours (4th Edition) (Paperback) by Dave Taylor
• For more advanced UNIX users:
  UNIX System V: A Practical Guide (3rd Edition) (Paperback) by Mark G. Sobell
• Also, for those of you not so familiar with bioinformatics:
  Bioinformatics for Dummies (Paperback) by Jean-Michel Claverie, Cedric Notredame

Perl books
• For some reason, although there are hundreds of Perl books out there, none of them are really that good.
Here are some that might be useful, but none are completely recommended.
• This one I recommend EXCEPT that it uses non-standard tools that come with the book:
  Beginning Perl for Bioinformatics (Paperback) by James Tisdall
• This I have heard good things about but not used much myself:
  Beginning Perl, Second Edition (Paperback) by James Lee
• This is a classic but slow going if you know no programming:
  Learning Perl, Fourth Edition (Paperback) by Randal L. Schwartz, Tom Phoenix, brian d foy
• This is better if you have little programming experience, but not a textbook:
  Perl for Dummies (Fourth Edition) (Paperback) by Paul Hoffman
• Once you get started:
  Programming Perl, 3rd edition, by Larry Wall, O'Reilly, 2001

Why use Perl?
• Interpreted language – quick to program
• Easy to learn compared to most languages
• Designed for working with text files
• Free for all operating systems
• Most popular language in bioinformatics – many scripts available you can "borrow", also ready-made modules.

Programming
• In Perl, the program, or script, is just a text file.
• You write it with ANY text editor (we are using WordPad and/or nano).
• Run the program
• Look at the output
• Correct the errors (debugging)
• Edit the script and try again.

All programming courses traditionally start with a program that prints "Hello, world!". So in keeping with that tradition:
Note: No line numbers. Each command line ends with a semicolon.

Remember your program?
  #!/usr/bin/perl
  print "Hello, world\n";

[...]... changes "n" to "newline". Other examples are \t (tab) or \$ (= print an actual dollar sign; normally a dollar sign has a special meaning).

Program details
• Perl programs on UNIX start with a line like:
  #!/usr/bin/perl
• Perl ignores anything after a #, so this first line is a command not to Perl, but to the UNIX shell.
• Elsewhere in the program # is used for comments to explain the code.
• Lines that are Perl commands...
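The escape sequences mentioned above (\n, \t and \$) can be tried out with a minimal sketch like the one below. The helper name escape_examples and the sample strings are made up for illustration and do not come from the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# escape_examples is a hypothetical helper (not from the slides).
# It returns three double-quoted strings, each using one escape sequence.
sub escape_examples {
    return (
        "Hello, world\n",    # \n ends the line with a newline
        "gene\tlength\n",    # \t puts a tab between the two words
        "Price: \$5\n",      # \$ prints a literal dollar sign
    );
}

# print accepts a list, so the three strings are printed one after another.
print escape_examples();
```

Without the backslash, Perl would try to treat the text after the dollar sign as a variable, which is why \$ is needed to print the character itself.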
  $age = <STDIN>;
  chomp $age;
  if ($age < 15) {
      print "You are too young for this kind of work!\n";
      die "too young";
  }
  if ($age > 25) {
      print "You're old enough to know better!\n";
      die "too old";
  }
  print "You have much to learn!\n";

Arrays
• An array can store multiple pieces of data
• They are essential for the most useful functions of Perl
• They can store data such as:
  – the lines of a text file (e.g. primer...

... way to do this is to "escape" the $
  print "The value of \$number is $number\n";
  The value of $number is 9

Variables - summary
• A variable name starts with a $
• It contains a number or a text string
• Use my to define a variable
• Use = to assign a value
• Use \ to stop the variable being interpolated
• Take care with variable names and with changing the contents of variables

... programming languages use "print" to mean "write this to the console", i.e. the command line
• Once upon a time, the console was a typewriter. But now "print" never means print on a printer.
• print statements are necessary to keep tabs on what your program is doing
• You need to tell Perl to put a carriage return at the end of a printed line
  – Use \n in a text string to signify a newline
  – The \ character...

Standard Input
• To make the program do something, we need to input data
  – The angle bracket operator (<>) tells Perl to expect input, by default from the keyboard
  – Usually this is assigned to a variable
  print "Please type a number: ";
  my $num = <STDIN>;
  print "Your number is $num\n";

chomp
• When data is entered from the keyboard, the program waits for you to press the carriage return (Enter) key
• But the string which...
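The chomp slide is cut off in this excerpt, so here is a minimal sketch of the behaviour it is introducing: a line read from the keyboard still carries the newline from the Return key, and chomp removes it. The helper name strip_newline is hypothetical, and the typed input is simulated with a fixed string so the sketch runs without a keyboard.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# strip_newline is a hypothetical helper (not from the slides).
# It returns a copy of its argument without a trailing newline.
sub strip_newline {
    my ($line) = @_;
    chomp $line;    # removes one trailing "\n" if there is one
    return $line;
}

# Assumption: this string stands in for what <STDIN> returns after
# the user types 42 and presses Return.
my $typed = "42\n";
my $num   = strip_newline($typed);

# The brackets make the position of the newline visible in the output.
print "Before chomp: [$typed]\n";
print "After chomp:  [$num]\n";
```

In an interactive script the same effect comes from reading with my $num = <STDIN>; and then calling chomp $num; directly on the variable.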
... exactly what to print. But in order for the program to generate what is printed, we need to use variables.
• A variable name starts with "$"
• It can be either a string or a number

Assigning values
In pretty much all programming languages, = means "assign this value to this variable". The "my" command in Perl declares the variable. This is optional but highly recommended. So, you assign values to a variable...

... chomp function to remove the newline character:
  print "Enter your name: ";
  $name = <STDIN>;
  print "Hello $name, happy to meet you!\n";
  chomp $name;
  print "Hello $name, happy to meet you!\n";

if and True/False
• All programming works on ones and zeros – true and false
  if (1 == 1) {
      print "one equals one";
  }
Perl evaluates the expression (1 == 1). Note TWO NOT ONE EQUALS SIGNS! The if statement causes the...

... equal to :$%^&*, etc...

... that doesn't exist, or overwriting an existing file

Exercising the Perl muscles
• Now let's write a script to ask the user their age, and then deliver an insult specific to the age bracket:
• Over 25 – old fogey
• Under 15 – callow youth
• 15-25 – (insert your own insult here)

Conditional Blocks, summary
• An if test can be used to control multiple lines of commands, as in this example:
  print "Enter ...
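The age-bracket exercise can also be sketched as a single if / elsif / else chain. This is one possible layout under stated assumptions, not the version from the slides: the slides use two separate if blocks with die, the helper name age_message is hypothetical, and the ages are hard-coded instead of being read from the keyboard.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# age_message is a hypothetical helper (not from the slides).
# It returns the message for the age bracket the number falls into.
sub age_message {
    my ($age) = @_;
    if ($age < 15) {
        return "You are too young for this kind of work!";
    }
    elsif ($age > 25) {
        return "You're old enough to know better!";
    }
    else {
        # Ages 15 to 25 inclusive end up here.
        return "You have much to learn!";
    }
}

# Try one sample age from each bracket.
for my $age (10, 20, 30) {
    print "$age: ", age_message($age), "\n";
}
```

Returning the message instead of calling die keeps the script running after each test, which makes it easier to check all three brackets in one go.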