Fetch files or sequences

The tools in this section retrieve things such as files or sequences, for example from the Web.

To use a script, cut and paste the code from the light green or blue box into a terminal window, change the bold, red text as needed, and hit Enter.

See More Information for notes on using these tools.

Fetch a sequence from a popular Internet database

Fetch a sequence from a popular Internet database (fetch_sequence_web)

Gets a sequence with a given id from a given database. (The database must be one of: swiss, genbank, genpept, embl, refseq.) The format of the fetched sequence (fasta by default) can be embl, fasta, gcg, genbank, swiss, or one of many other formats: see The Bioperl SeqIO HOWTO for details.

This script requires Bioperl to be installed (on whichever machine the script runs on). Many biology computers will have it installed. If the script breaks because it "can't locate Bio/Perl.pm", you can download Bioperl from bioperl.org.

$database Database name
$id Identifier
$format Format to write sequence in
seq.fasta Output file

perl -MBio::Perl -e ' $database="embl"; $id="AI129902"; $format="fasta"; $sequence = get_sequence($database, $id); write_sequence(">-", $format, $sequence); warn "Wrote $database sequence $id in $format format\n"; ' > seq.fasta

Example: Get the AI129902 sequence from EMBL in FASTA format, and put it in seq.fasta by running the above script.

Output file (seq.fasta):
 >AI129902; qc41b07.x1 Soares_pregnant_uterus_NbHPU Homo sapiens cDNA [etc.]
 CTCCGCGCCAACTCCCCCCACCCCCCCCCCACACCCC

Screen output:
 Wrote embl sequence AI129902 in fasta format

Get a file from the Web

Fetch a file from the web (fetch_file_web)

Given an http or ftp address, get a file and store it in a given filename. This assumes you have an Internet connection, the file exists, etc. If something breaks, it should print an error message.

$web_file Web address
$store Name of file to save to

perl -MLWP::Simple -e ' $web_file="ftp://ftp.ncbi.nih.gov/genbank/GB_Release_Number"; $store="GB.txt"; if (is_success(getstore($web_file, $store))) { warn "Downloaded $web_file into $store\n"; } else { warn "Error downloading $web_file\n" } '

Example: Run the above script to download the current GenBank release number to a file GB.txt. The resulting file will contain one line, giving the release number (as of this writing, 151).

Example 2: Download the NCBI home page by setting $web_file to "http://ncbi.nih.gov" and $store to "ncbi.html".


More Information

General Fetching Notes

As always, when in doubt, check your output files after each step!

Scripts that need to fetch information from the Web will of course break if your Internet connection isn't working.

General Scriptome Notes

Scriptome tools are in blue or green boxes. Cut and paste the text of the tool into a terminal window. Then edit the line as needed. Things that will often need to be edited are highlighted in red. Input and output filenames will almost always need to be changed.

All scripts that work on tabular data assume the data is tab-separated. Use a Change script to change, e.g., comma-separated data to tab-separated before using these scripts.

When working with tabular data, remember that the first column is called column 0, the second column is column 1, etc. The last column can also be referred to as column -1, second-to-last column is -2, etc.


 
