1. Basic concepts
1.1 Base types
A Adenine
C Cytosine
G Guanine
T Thymine
1.2 Base pairs
G-C | C-G
A-T | T-A
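The pairing rules above are enough to derive the opposite strand of a DNA sequence. A minimal sketch, assuming uppercase input that contains only A, C, G and T; the names COMPLEMENT, complement and reverse_complement are illustrative, not from any library:

```python
# Watson-Crick pairs from the notes: A-T and G-C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq: str) -> str:
    """Return the base-by-base complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)

def reverse_complement(seq: str) -> str:
    """Return the complement read in the opposite direction,
    i.e. the other strand written 5' to 3'."""
    return complement(seq)[::-1]

print(complement("GATTACA"))          # CTAATGT
print(reverse_complement("GATTACA"))  # TGTAATC
```

Characters outside ACGT pass through str.translate unchanged, so ambiguity codes would need their own entries in the table.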
- Meaning of the letters other than ACGT
Y Pyrimidine (C or T)
R Purine (A or G)
W Weak (A or T)
S Strong (G or C)
K Keto (T or G)
M Amino (C or A)
D A, G, T (not C - remember as after C)
V A, C, G (not T - remember as after T/U - We'll get to "U" soon)
H A, C, T (not G - remember as after G)
B C, G, T (not A - remember as after A)
N Any base
- Gap
"Certain groups of nucleotides that share some common attribute may be designated by so called ambiguity codes" (Biostar Handbook)
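One practical use of the ambiguity codes is pattern matching: each code expands to the set of bases it stands for. A rough sketch, assuming the table above and ignoring the gap character; IUPAC, iupac_to_regex and matches are hypothetical names chosen for this example:

```python
import re

# Ambiguity codes from the list above, mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC",
    "D": "AGT", "V": "ACG", "H": "ACT", "B": "CGT",
    "N": "ACGT",
}

def iupac_to_regex(pattern: str) -> str:
    """Turn an IUPAC pattern into a regular expression over A, C, G, T."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC[code]  # raises KeyError for unknown letters
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

def matches(pattern: str, seq: str) -> bool:
    """True if the concrete sequence is one of the sequences the pattern allows."""
    return re.fullmatch(iupac_to_regex(pattern), seq.upper()) is not None

print(iupac_to_regex("GAYN"))   # GA[CT][ACGT]
print(matches("GAYN", "GATC"))  # True
print(matches("GAYN", "GAAC"))  # False
```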
1.3 Genome
- coding regions vs. "junk" regions (controversy: the C-value paradox)
- sizes:
Ebola virus genome: 18 thousand basepairs (18Kb)
E.coli bacteria genome: 4 million basepairs (4Mb)
Baker's yeast fungus genome: 12 million basepairs (12Mb)
Fruit fly genome: 120 million basepairs (120Mb)
Human genome: 3 billion basepairs (3Gb)
Some salamander species: 120 billion basepairs (120Gb)
1.4 RNA
- T (Thymine) is replaced by the nucleotide U (Uracil)
- so the RNA alphabet is ACGU
T --> U
- mRNA, tRNA, rRNA, etc
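On the sequence level, the T to U replacement is a plain string substitution. A minimal sketch, assuming the DNA string is the coding strand (the one that reads like the mRNA); transcribe and back_transcribe are illustrative names:

```python
def transcribe(dna: str) -> str:
    """Rewrite a DNA coding-strand string in the RNA alphabet (T -> U)."""
    return dna.upper().replace("T", "U")

def back_transcribe(rna: str) -> str:
    """Inverse operation: rewrite an RNA string in the DNA alphabet (U -> T)."""
    return rna.upper().replace("U", "T")

print(transcribe("ATGCGTTAA"))       # AUGCGUUAA
print(back_transcribe("AUGCGUUAA"))  # ATGCGTTAA
```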
1.5 Protein
- 20 kinds of amino acids
ARNDCEQGHILKMFPSTWYV
- DNA → mRNA → Protein
- Translation table (example below)
CGU, CGC, CGA, CGG, AGA, AGG --> Arginine (Arg/R)
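A translation table is just a lookup from codons to amino acids. The sketch below assumes the standard genetic code (NCBI translation table 1) written in the usual TCAG order; CODON_TABLE and translate are names made up for this example, and real organisms or organelles may use other tables:

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(mrna: str) -> str:
    """Translate an mRNA (or DNA) string codon by codon until a stop codon."""
    dna = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)

# The six arginine codons from the example above all map to R.
for codon in ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"]:
    print(codon, translate(codon))

print(translate("AUGCGUAGAUAA"))  # MRR
```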
1.6 ORF
- open reading frame (a run of at least, say, 100 consecutive codons without a stop codon)
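The definition above can be turned into a simple scan. This is a sketch under stated assumptions: it checks only the three forward frames, uses the standard stop codons TAA, TAG and TGA, and does not require a start codon (many ORF finders additionally require ATG and also scan the reverse strand). find_orfs and min_codons are hypothetical names:

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq: str, min_codons: int = 100):
    """Return (start, end, frame) for every stop-free run of at least
    min_codons codons. start/end are 0-based, end-exclusive positions."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        run_start = frame
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if (i - run_start) // 3 >= min_codons:
                    orfs.append((run_start, i, frame))
                run_start = i + 3  # next run begins after the stop codon
        # Close a run that reaches the last complete codon of this frame.
        end = frame + ((len(seq) - frame) // 3) * 3
        if (end - run_start) // 3 >= min_codons:
            orfs.append((run_start, end, frame))
    return orfs

# Toy sequence with a tiny threshold so the result is easy to check by hand.
print(find_orfs("ATGAAACCCGGGTAAATG", min_codons=5))  # [(2, 17, 2)]
```

With the threshold of 100 codons suggested in the notes, short random stretches are filtered out, because stop codons are expected fairly often by chance.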
1.7 Gene
A region (or regions) that includes all of the sequence elements necessary to encode a functional transcript. A gene may include regulatory regions, transcribed regions and other functional sequence regions.
1.8 Other regions
- UTR (untranslated regions)
- Promoter regions
- CpG islands
- Enhancers
1.9 Homology
- Two regions of DNA that evolved from the same sequence
Homology is not a synonym of sequence similarity!
Homologous sequences are usually similar to one another, but similarity of sequences does not indicate homology.