Sequences and Their Notations

DOCUMENT INFORMATION

Basic information

Format
Pages	30
File size	1.06 MB

Content


Platination of telomeric sequences and nuclease hypersensitive elements of human c-myc and PDGF-A promoters and their ability to form G-quadruplexes

Viktor Viglasky
Department of Biochemistry, Faculty of Sciences, Institute of Chemistry, P. J. Safarik University, Kosice, Slovakia

G-rich regions appear in several locations in the human genome, including at the ends of linear chromosomes, the immunoglobulin switch region, centromeres, fragile X syndrome repeats, and promoters of some genes [1]. The sequences repeated in tandem, with three or four adjacent guanines, have been known to form polymorphic quadruplexes containing G-quartets stabilized by cyclic Hoogsteen hydrogen bonds. Quadruplex structures are highly stable DNA or RNA structures formed on G-rich sequences [2]. The Na+ and K+ ions stabilize the stacking through their interactions with carbonyl oxygens of the eight guanines of two adjacent quartets [3]. Direct evidence for the presence of G-quadruplex structures in vivo has been reported both at the telomeres of the ciliate Stylonychia [4] and those of humans [5], and at the promoter of c-myc [6,7]. Moreover, other genomic regions were shown to be able to adopt quadruplex structures, such as the promoters of c-kit oncogene [6], HIF-1α [9], Bcl2 [10] and vascular endothelial growth factor [11].

The stabilization of the G-quadruplex structure by small molecules is currently emerging as a very promising anti-cancer strategy. Therefore, molecules that stabilize G-quadruplex structures can be used as potential anti-cancer agents [12]. Indeed, recent studies strongly suggest that molecules able to stabilize the quadruplex structure of DNA can lead to an arrest of the proliferation of cancer cells [5,12–14]. At each division of somatic cells, telomeres are shortened, a process leading to senescence and death.

It has been shown in vitro that G-quadruplex structures of the human sequence (G3T2A)3G3 formed in the presence of molecules stabilizing the G-quartet stacks, similar to anthraquinones or porphyrins, inhibit the activity of telomerase [13–17]. The anti-tumor drug cisplatin (cis-[PtCl2(NH3)2]), known for its high affinity for G-rich sequences, was ...

Keywords: cisplatin; c-myc; G-quadruplex; PDGF-A promoter; telomeric sequences

Correspondence: V. Viglasky, Department of Biochemistry, Faculty of Sciences, Institute of Chemistry, Safarik University, Moyzesova 11, 04011 Kosice, Slovakia
Fax: +421 55 622 21 24
Tel: +421 55 234 12 62
E-mail: viktor.viglasky@upjs.sk

(Received 30 August 2008, revised 5 November 2008, accepted 7 November 2008)
doi:10.1111/j.1742-4658.2008.06782.x

Naturally occurring G-rich DNA sequences that are able to form G-quadruplex structures appear as potential targets for anti-cancer chemotherapy, and therefore play an important role in cellular processes, such as cell aging, death and carcinogenesis. The telomeric regions of DNA and nuclease hypersensitive elements of human c-myc and PDGF-A promoters represent a very appealing target for cisplatin and may interfere with normal DNA function. Platinum complexes bind covalently to nucleobases, and especially to the N7 atom of guanines, and the four guanines of a G-quartet have their N7 atoms involved in hydrogen bonding. Therefore, within a G-quadruplex structure, only the guanines out of the stack of G-quartets should react with electrophilic species such as platinum(II) complexes.
Platinum ...

Sequences and Their Notations
By: OpenStaxCollege

A video game company launches an exciting new advertising campaign. They predict the number of online visits to their website, or hits, will double each day. The model they are using shows 2 hits the first day, 4 hits the second day, 8 hits the third day, and so on. See [link].

Day	1	2	3	4	5	…
Hits	2	4	8	16	32	…

If their model continues, how many hits will there be at the end of the month? To answer this question, we'll first need to know how to determine a list of numbers written in a specific order. In this section, we will explore these kinds of ordered lists.

Writing the Terms of a Sequence Defined by an Explicit Formula

One way to describe an ordered list of numbers is as a sequence. A sequence is a function whose domain is a subset of the counting numbers. The sequence established by the number of hits on the website is {2, 4, 8, 16, 32, …}. The ellipsis (…) indicates that the sequence continues indefinitely. Each number in the sequence is called a term. The first five terms of this sequence are 2, 4, 8, 16, and 32.

Listing all of the terms for a sequence can be cumbersome. For example, finding the number of hits on the website at the end of the month would require listing out as many as 31 terms. A more efficient way to determine a specific term is by writing a formula to define the sequence.

One type of formula is an explicit formula, which defines the terms of a sequence using their position in the sequence. Explicit formulas are helpful if we want to find a specific term of a sequence without finding all of the previous terms. We can use the formula to find the nth term of the sequence, where n is any positive integer. In our example, each number in the sequence is double the previous number, so we can use powers of 2 to write a formula for the nth term. The first term of the sequence is $2^1 = 2$, the second term is $2^2 = 4$, the third term is $2^3 = 8$, and so on. The nth term of the sequence can be found by raising 2 to the nth power. An explicit formula for a sequence is named by a lower case letter a, b, c, ... with the subscript n. The explicit formula for this sequence is $a_n = 2^n$.

Now that we have a formula for the nth term of the sequence, we can answer the question posed at the beginning of this section. We were asked to find the number of hits at the end of the month, which we will take to be 31 days. To find the number of hits on the last day of the month, we need to find the 31st term of the sequence. We will substitute 31 for n in the formula:

$a_{31} = 2^{31} = 2,147,483,648$

If the doubling trend continues, the company will get 2,147,483,648 hits on the last day of the month. That is over 2.1 billion hits! The huge number is probably a little unrealistic because it does not take consumer interest and competition into account. It does, however, give the company a starting point from which to consider business decisions.

Another way to represent the sequence is by using a table. The first five terms of the sequence and the nth term of the sequence are shown in [link].

n	1	2	3	4	5	n
nth term of the sequence, $a_n$	2	4	8	16	32	$2^n$

Graphing provides a visual representation of the sequence as a set of distinct points. We can see from the graph in [link] that the number of hits is rising at an exponential rate. This particular sequence forms an exponential function.

Lastly, we can write this particular sequence as {2, 4, 8, 16, 32, …, $2^n$, …}. A sequence that continues indefinitely is called an infinite sequence. The domain of an infinite sequence is the set of counting numbers. If we consider only the first 10 terms of the sequence, we could write {2, 4, 8, 16, 32, …, $2^n$, …, 1024}. This sequence is called a finite sequence because it does not continue indefinitely.

A General Note: Sequence

A sequence is a function whose domain is the set of positive integers. A finite sequence is a sequence whose domain consists of only the first n positive integers. The numbers in a sequence are called terms. The variable a with a number subscript is used to represent the terms in a sequence and to indicate the position of the term in the sequence.

$a_1, a_2, a_3, …, a_n, …$

We call $a_1$ the first term of the sequence, $a_2$ the second term of the sequence, $a_3$ the third term of the sequence, and so on. The term $a_n$ is called the nth term of the sequence, or the general term of the sequence. An explicit formula defines the nth term of a sequence using the position of the term. A sequence that continues indefinitely is an infinite sequence.

Q&A

Does a sequence always have to begin with $a_1$?

No. In certain problems, it may be useful to define the initial term as $a_0$ instead of $a_1$. In these problems, the domain of the function includes 0.

How To

Given an explicit formula, write the first n terms of a sequence. Substitute each value of n into the formula. Begin with n = 1 to find the first term, $a_1$. To find the second term, $a_2$, use n = 2. Continue in the same ...

EXTENDING GENERALIZED FIBONACCI SEQUENCES AND THEIR BINET-TYPE FORMULA

MUSTAPHA RACHIDI AND OSAMU SAEKI

Received 10 March 2006; Accepted 2 July 2006

We study the extension problem of a given sequence defined by a finite order recurrence to a sequence defined by an infinite order recurrence with periodic coefficient sequence. We also study infinite order recurrence relations in a strong sense and give a complete answer to the extension problem. We also obtain a Binet-type formula, answering several open questions about these sequences and their characteristic power series.

Copyright © 2006 M. Rachidi and O. Saeki. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

The notion of an ∞-generalized Fibonacci sequence (∞-GFS) has been introduced in [7] and studied in [1, 8, 10].
This class of sequences defined by linear recurrences of infinite order is an extension of the class of ordinary (weighted) r-generalized Fibonacci sequences (r-GFSs) with r finite defined by linear recurrences of rth order (e.g., see [3–6, 9], etc.). Such sequences are defined as follows. Let $\{a_i\}_{i=0}^{\infty}$ and $\{\alpha_{-i}\}_{i=0}^{\infty}$ be two sequences of complex numbers, where $a_i \neq 0$ for some $i$. The associated ∞-GFS $\{V_n\}_{n \in \mathbb{Z}}$ is defined by

$$V_n = \alpha_n \quad \text{if } n \le 0, \tag{1.1}$$

$$V_n = \sum_{i=0}^{\infty} a_i V_{n-i-1} \quad \text{if } n \ge 1. \tag{1.2}$$

The sequences $\{a_i\}_{i=0}^{\infty}$ and $\{\alpha_{-i}\}_{i=0}^{\infty}$ are called the coefficient sequence and the initial sequence, respectively. As is easily observed, the general terms $V_n$ may not necessarily exist. In [1], necessary and sufficient conditions for the existence of the general terms have been studied. When there exists an $r \ge 1$ such that $a_i = 0$ for all $i \ge r$, we call the sequence $\{V_n\}_{n \ge -r+1}$ an r-GFS with initial sequence $\{V_{-r+1}, V_{-r+2}, \ldots, V_0\}$. For an r-GFS, the numbering often starts with $V_1$ instead of $V_{-r+1}$. In such a case, all the numberings shift by $r$.

Hindawi Publishing Corporation, Advances in Difference Equations, Volume 2006, Article ID 23849, Pages 1–11, DOI 10.1155/ADE/2006/23849

The case where the coefficient sequence $\{a_i\}_{i=0}^{\infty}$ is periodic, that is, the case where there exists an $r \ge 1$ such that $a_{i+r} = a_i$ for every $i \ge 0$, is considered in [2]. It was shown that in such a case, the associated ∞-GFS is an r-GFS associated with the coefficient sequence

$$\left(a_0, a_1, \ldots, a_{r-2}, a_{r-1} + 1\right), \tag{1.3}$$

and the initial sequence $\{V_1, V_2, \ldots, V_r\}$, where $r \ge 1$ is the period.

Thus, the following problem naturally arises. Given an r-GFS, can one always extend it to an ∞-GFS associated with a periodic coefficient sequence? If it is not always the case, then characterize those r-GFSs which can be extended to an ∞-GFS associated with a periodic coefficient sequence.
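In the finite-order case, where the coefficients vanish from index r onward, the sum in the recurrence has only r terms: V_n = a_0 V_{n-1} + a_1 V_{n-2} + ... + a_{r-1} V_{n-r} for n ≥ 1. The following is a minimal Python sketch of that case only. The function name `r_gfs_terms` and the argument layout are illustrative assumptions, not notation from the paper, and nothing here addresses the infinite-order sum or its convergence.

```python
def r_gfs_terms(coeffs, initial, count):
    """Return [V_1, ..., V_count] of an r-GFS (hypothetical helper).

    coeffs  -- a_0, ..., a_{r-1}, the finitely many coefficients
    initial -- V_{-r+1}, ..., V_0, listed in that order
    """
    r = len(coeffs)
    if len(initial) != r:
        raise ValueError("an r-GFS needs exactly r initial values")
    values = list(initial)  # values[-1] is always the most recent term
    terms = []
    for _ in range(count):
        # V_n = sum over i of a_i * V_{n-i-1}; values[-1 - i] is V_{n-i-1}
        nxt = sum(a * values[-1 - i] for i, a in enumerate(coeffs))
        values.append(nxt)
        terms.append(nxt)
    return terms


# r = 2 with a_0 = a_1 = 1 and initial sequence V_{-1} = 0, V_0 = 1
# reproduces the ordinary Fibonacci numbers.
fib = r_gfs_terms([1, 1], [0, 1], 8)
print(fib)
```

Keeping the whole history in a list is the simplest choice for a sketch; only the last r values are actually needed at each step.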
In this paper, we first show that under a mild condition on the coefficients, an r-GFS can always be extended to an ∞-GFS associated with a periodic coefficient sequence (Proposition 2.1). On the other hand, it was shown that a root of the characteristic polynomial of an r-GFS does not always give an ∞-GFS associated with a periodic coefficient sequence (see [2, Example 3.4]). In order to analyze this type of phenomena, in Section 3, we introduce the notion of a strongly ∞-GFS, imposing the condition (1.2) not only for n ≥ 1, but for all n ∈ Z. In a sense, this condition is more natural than requiring the equation only for n ≥ 1, and it has already appeared in [7, Problem 3.11]. The main result of this paper is a characterization theorem of those r-GFSs which can be extended to a strongly ∞-GFS associated with a periodic coefficient sequence (Theorem 3.2). This gives a complete solution to the problem mentioned above

Hindawi Publishing Corporation, EURASIP Journal on Bioinformatics and Systems Biology, Volume 2006, Article ID 35809, Pages 1–8, DOI 10.1155/BSB/2006/35809

Multipattern Consensus Regions in Multiple Aligned Protein Sequences and Their Segmentation

David K. Y. Chiu and Yan Wang
Department of Computing and Information Science, University of Guelph, Guelph, ON, Canada N1G 2W1

Received 23 November 2005; Revised 22 May 2006; Accepted 7 June 2006
Recommended for Publication by John Quackenbush

Decomposing a biological sequence into its functional regions is an important prerequisite to understand the molecule. Using the multiple alignments of the sequences, we evaluate a segmentation based on the type of statistical variation pattern from each of the aligned sites. To describe such a more general pattern, we introduce multipattern consensus regions as segmented regions based on conserved as well as interdependent patterns. Thus the proposed consensus region considers patterns that are statistically significant and extends a local neighborhood.
To show its relevance in protein sequence analysis, a cancer suppressor gene called p53 is examined. The results show significant associations between the detected regions and tendency of mutations, location on the 3D structure, and cancer hereditable factors that can be inferred from human twin studies.

Copyright © 2006 D. K. Y. Chiu and Y. Wang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Decomposing a sequence into regions can be extremely important in understanding the functional characteristics of the biomolecule. Performing this using multiple alignments of the sequence family can dramatically improve the reliability of the interpretation, as well as capturing the overall property beyond the original sequence. Thus a consensus sequence, or frequency pattern along a segment across multiple aligned sequences, provides a convenient characteristic to indicate a commonly observed, and likely an intrinsic, property of the sequences. A well-known example is the TATA box, a DNA sequence (consensus TATAAA) upstream of the transcription start site in the promoter region of many eukaryotic genes. In addition, the notion of consensus structure (see Chiu and Kolodziejczak [1], Chiu and Harauz [2]), proposed in the early 1990s, captures a different feature discovered from multiple aligned sequences. It confirms that a jointly inferred 2D, and even 3D, structure can in some cases be recovered from the aligned sequences; see Chiu and Harauz [2]. In these cases, the multiple aligned sequences can be treated as a sample observation of the sequence family. The detected pattern is analogous to an estimated overall feature of the biomolecules from the sequences.
In this paper, we extend the notion further to propose the multipattern consensus region, which generalizes the consensus sequence that has been found to be extremely useful in sequence analysis. A multipattern consensus region is defined as a region segment given the multiple alignments of the sequences so that the segment is dominated by sites that are conserved or, in another instance, interdependent pattern characteristics. To define the patterns more rigorously, the patterns are detected based on a statistical test of significance, rather than frequency count. Note that multipattern consensus region ...

STUDY OF THE RELATIONSHIP BETWEEN Mus musculus PROTEIN SEQUENCES AND THEIR BIOLOGICAL FUNCTIONS

A Thesis Presented to The Graduate Faculty of The University of Akron
In Partial Fulfillment of the Requirements for the Degree Master of Science

Pawan Seth
May, 2007

Thesis Approved:
Advisor: Dr. Zhong-Hui Duan
Committee Member: Dr. Chien-Chung Chan
Committee Member: Dr. Xuan-Hien Dang
Committee Member: Dr. Yingcai Xiao
Department Chair: Dr. Wolfgang Pelz

Accepted:
Dean of the College: Dr. Ronald F. Levant
Dean of the Graduate School: Dr. George R. Newkome
Date:

ABSTRACT

The central challenge in the post-genomic era is the characterization of biological functions of newly discovered proteins. Sequence similarity based approaches infer protein functions based upon the homology between proteins. In this thesis, we present the similarity relationship between protein sequences and functions for the mouse proteome in the context of gene ontology slim.
The similarity between protein sequences is computed using a novel measure based upon the local BLAST alignment scores. The similarity between protein functions is characterized using the three gene ontology categories. In the study, the ontology categories are represented using a general tree structure. Three ontology trees are constructed using the definitions provided in gene ontology slim. The mouse protein sequences are then mapped onto the trees. We present the sequence similarity distributions at different levels of the GO tree. The similarities of protein sequences across gene ontology levels and traversing branches are studied. The posterior probabilities for correct predictions are calculated to study the mathematical underpinnings in evaluating the similarities between the protein sequences.

Our results indicate that proteins with similar amino acid sequences have similar biological functions. Although the similarity distribution in each functional group across GO levels varies from one functional group to another, the comparison between distributions of parent and child groups reveals the strong relationship between sequence and function similarity. We conclude that the sequence similarity approach can function as a key measure in the prediction of biological functions of unknown proteins. Our results suggest that the posterior probability of a correct prediction could also serve as one of the key measures for protein function prediction.

ACKNOWLEDGEMENTS

I would like to express my sincere appreciation to my advisor, Dr. Zhong-Hui Duan, for her constant encouragement and invaluable guidance during this study. I am grateful to her for offering me an opportunity to do my thesis under her. I am very impressed by her kindness and personality. This thesis and my study in the Computer Science Department would not have been possible without her help and support. I would also like to acknowledge the help of the Computer Science Department for offering me an assistantship.
I would also like to acknowledge the help from Dr. Wolfgang Pelz, Dr. Yingcai Xiao, Dr. Timothy W. O'Neil, Dr. Xuan-Hien Dang, Dr. Chien-Chung Chan, Dr. K.J. Liszka and Ms. Peggy Speck for their constant assistance. I would like to dedicate this thesis to my family. Without their encouragement, love and support, I do not think I could have finished this degree, this thesis and the study at the University of Akron. I am forever indebted to them for the sacrifices they made to help me achieve this success.

TABLE OF CONTENTS
LIST OF TABLES ...

Chapter 2: Attackers and Their Attacks
Security+ Guide to Network Security Fundamentals, Second Edition

Objectives
• Develop attacker profiles
• Describe basic attacks
• Describe identity attacks
• Identify denial of service attacks
• Define malicious code (malware)

Developing Attacker Profiles
• Six categories:
– Hackers
– Crackers
– Script kiddies
– Spies
– Employees
– Cyberterrorists

Hackers
• Person who uses advanced computer skills to attack computers, but not with a malicious intent
• Use their skills to expose security flaws

Crackers
• Person who violates system security with malicious intent
• Have advanced knowledge of computers and networks and the skills to exploit them
• Destroy data, deny legitimate users of service, or otherwise cause serious problems on computers and networks

Script Kiddies
• Break into computers to create damage
• Are unskilled users
• Download automated hacking software from Web sites and use it to break into computers
• Tend to be young computer users with almost unlimited amounts of leisure time, which they can use to attack systems

Spies
• Person hired to break into a computer and steal information
• Do not randomly search for unsecured computers to attack
• Hired to attack a specific computer that contains sensitive information

Employees
• One of the largest information security threats to business
• Employees break into their company's computer for these reasons:
– To show the company a weakness in their security
– To say, "I'm smarter than all of you"
– For money

Cyberterrorists
• Experts fear terrorists will attack the network and computer infrastructure to cause panic
• Cyberterrorists' motivation may be defined as ideology, or attacking for the sake of their principles or beliefs
• One of the targets highest on the list of cyberterrorists is the Internet itself

[...]

Examining Identity Attacks
• Category of attacks in which the attacker attempts to assume the identity of a valid user

Replay
• Similar to an active man-in-the-middle attack
• Whereas... contents of a message before sending it on, a replay attack only captures the message and then sends it again later
• Takes advantage of communications between a network device and a file server

Summary (continued)
• Identity attacks attempt to assume the identity of a valid user
• Denial of service (DoS) attacks flood a server or device with requests, making it unable to respond to valid...

Identifying Denial of Service Attacks (continued)
• Policies to minimize password-guessing attacks:
– Passwords must have at least eight characters
– Passwords must contain a combination of letters, numbers, and special characters
– Passwords should expire at least every 30 days
– Passwords cannot be reused for 12 months
– The same password should not be duplicated and used on two or more systems

Password... organization dropping below a specified level

Password Guessing (continued)

Identifying Denial of Service Attacks (continued)
• Another DoS attack tricks computers into responding to a false request
• An attacker can send a request to all... overwhelming it, and causing the server to crash or be unavailable to legitimate users

Weak Keys (continued)
• Encryption: changing the original text to a secret message using cryptography
• Success of cryptography depends on the process used to encrypt and decrypt messages
• Process is based on algorithms

Summary
• Six categories of

Series and Their ... n +n

an = { 4+n 2n if n is even + n if n is odd

a1 = 2, an = ( − an − + 1)

an = 1, an = an − +

$a_n = \frac{(n+1)!}{(n-1)!}$

...

an = 2n −

For the following exercises, write a recursive formula for the sequence using the first five points shown on the graph.

...

a2 = 0, an = an − − an −

Calculate the first eight terms of the sequences $a_n = \frac{(n+2)!}{(n-1)!}$ and $b_n = n^3 + 3n^2 + 2n$, and then make a conjecture about
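Explicit formulas like the ones in this section are easy to check numerically. The sketch below, written in Python as an illustration only, evaluates the website-hits formula a_n = 2^n and then lists the first eight terms of the two sequences in the last exercise, a_n = (n + 2)!/(n − 1)! and b_n = n^3 + 3n^2 + 2n. The function names `hits`, `a`, and `b` are assumptions chosen for this example and do not come from the text.

```python
from math import factorial


def hits(n):
    """Explicit formula for the website-hits sequence: a_n = 2^n."""
    return 2 ** n


def a(n):
    """a_n = (n + 2)! / (n - 1)!, using exact integer division."""
    return factorial(n + 2) // factorial(n - 1)


def b(n):
    """b_n = n^3 + 3n^2 + 2n."""
    return n**3 + 3 * n**2 + 2 * n


# Positions start at n = 1, matching the convention used for a_1.
first_five_hits = [hits(n) for n in range(1, 6)]
day_31_hits = hits(31)

a_terms = [a(n) for n in range(1, 9)]
b_terms = [b(n) for n in range(1, 9)]

print(first_five_hits)
print(day_31_hits)
print(a_terms)
print(b_terms)
```

Comparing the last two printed lists term by term is one way to arrive at the conjecture the exercise asks for. Agreement on eight terms is only evidence; expanding the factorial quotient as (n + 2)(n + 1)n settles the question for every n.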

Date posted: 31/10/2017, 16:53
