  • 187. Repeated DNA Sequences

    Problem:

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

    Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

    For example,

    Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",
    
    Return:
    ["AAAAACCCCC", "CCCCCAAAAA"].

    Link: http://leetcode.com/problems/repeated-dna-sequences/

    Solution:

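    The task is to find the repeated 10-letter sequences in a DNA string. The most direct idea is to scan from the start, put each 10-letter substring into a HashMap, and count how often it appears. substring() costs time proportional to the length of the substring it copies (a fixed L = 10 here), not to the length of s, so the whole scan takes O(n · L) time, which is O(n) for this problem. Some discussion posts use a Rabin-Karp rolling hash with a hand-written hash function. As far as I can tell the benefit is fewer substring() calls, since the window hash can be updated in O(1) per step; asymptotically the running time is still O(n). With Rabin-Karp there is also the question of whether to check for collisions, and whether to accept a hash match as-is (Monte Carlo) or verify it against the actual characters (Las Vegas). This needs more study.

    Time Complexity - O(n · L) with L = 10, i.e. O(n); Space Complexity - O(n).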

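    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class Solution {
        public List<String> findRepeatedDnaSequences(String s) {
            List<String> res = new ArrayList<String>();
            if (s == null || s.length() < 10) {
                return res;
            }
            // Maps each 10-letter window to the number of times it has been seen.
            Map<String, Integer> map = new HashMap<>();

            for (int i = 0; i < s.length() - 9; i++) {
                String subStr = s.substring(i, i + 10);
                if (map.containsKey(subStr)) {
                    // Report a window only on its second occurrence to avoid duplicates.
                    if (map.get(subStr) == 1) {
                        res.add(subStr);
                    }
                    map.put(subStr, map.get(subStr) + 1);
                } else {
                    map.put(subStr, 1);
                }
            }

            return res;
        }
    }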
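The collision question raised above can be made concrete with a small sketch. The version below keeps a Rabin-Karp style polynomial hash of the current 10-letter window and, whenever two windows share a hash, compares the actual characters before reporting a repeat (the Las Vegas choice; trusting the hash alone would be the Monte Carlo choice). The class name RabinKarpDna and the constants BASE and MOD are assumptions made for illustration and are not taken from any of the referenced posts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RabinKarpDna {
    private static final int WINDOW = 10;
    private static final long BASE = 256;            // assumed radix
    private static final long MOD = 1_000_000_007L;  // assumed prime modulus

    static List<String> findRepeatedDnaSequences(String s) {
        // LinkedHashSet keeps results unique and in order of first repeat.
        Set<String> repeated = new LinkedHashSet<>();
        if (s == null || s.length() < WINDOW) {
            return new ArrayList<>(repeated);
        }

        // BASE^(WINDOW-1) mod MOD: weight of the character leaving the window.
        long high = 1;
        for (int i = 1; i < WINDOW; i++) {
            high = high * BASE % MOD;
        }

        // hash -> start indices of the distinct windows that produced it
        Map<Long, List<Integer>> starts = new HashMap<>();
        long hash = 0;
        for (int i = 0; i < s.length(); i++) {
            if (i >= WINDOW) {
                // Remove the contribution of the outgoing character.
                hash = (hash - s.charAt(i - WINDOW) * high % MOD + MOD) % MOD;
            }
            // Shift the window and add the incoming character.
            hash = (hash * BASE + s.charAt(i)) % MOD;
            if (i < WINDOW - 1) {
                continue; // window not full yet
            }

            int start = i - WINDOW + 1;
            List<Integer> bucket = starts.computeIfAbsent(hash, k -> new ArrayList<>());
            boolean matched = false;
            for (int prev : bucket) {
                // Las Vegas step: confirm the match character by character.
                if (s.regionMatches(prev, s, start, WINDOW)) {
                    repeated.add(s.substring(start, start + WINDOW));
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                bucket.add(start); // new distinct window (or a true collision)
            }
        }
        return new ArrayList<>(repeated);
    }
}
```

With this layout substring() is called only for windows that are confirmed repeats, and the verification costs at most 10 character comparisons per candidate, so the result is always exact regardless of how good the hash is.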

    Second pass:

    Same old approach: use a HashMap to detect the repeats.

    Stefan's post shows that a Set works too; it reads more like Python and runs faster.
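A minimal sketch of the Set idea, written from the description above rather than copied from the referenced post: one set records every window seen so far, and a second set collects the windows seen again. The class name SetBasedDna is hypothetical, and the LinkedHashSet is an added choice so that results come out in order of first repeat.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetBasedDna {
    // Returns every 10-letter substring that occurs more than once in s.
    static List<String> findRepeatedDnaSequences(String s) {
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new LinkedHashSet<>();
        if (s == null) {
            return new ArrayList<>(repeated);
        }
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // Set.add returns false when the element is already present,
            // so a false result means this window has appeared before.
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }
}
```

Compared with the HashMap version, there is no counter to read and update: the return value of add() carries the "seen before" information, which removes one lookup per window.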

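    With a rolling hash the time complexity is O(n), but if the hash of all 10 characters is recomputed at every position it is not fast either (a true rolling update only handles the character entering the window and the one leaving it). I should practice this when I get a chance.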
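One way to get an O(1) update per position, sketched here under the assumption that the input contains only A, C, G and T: give each nucleotide a 2-bit code, so a 10-letter window fits in 20 bits of an int. Shifting in the new code and masking off the old one updates the key, and because the encoding is exact there are no collisions to check. The class name BitEncodedDna and the helper code() are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BitEncodedDna {
    private static final int WINDOW = 10;
    // The 20 low bits hold ten 2-bit nucleotide codes.
    private static final int MASK = (1 << (2 * WINDOW)) - 1;

    // Assumed encoding: A=0, C=1, G=2, T=3.
    private static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    static List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<>();
        if (s == null || s.length() < WINDOW) {
            return res;
        }
        Set<Integer> seen = new HashSet<>();      // keys of all windows so far
        Set<Integer> reported = new HashSet<>();  // keys already added to res
        int key = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new nucleotide; the mask drops the one that left.
            key = ((key << 2) | code(s.charAt(i))) & MASK;
            if (i < WINDOW - 1) {
                continue; // window not full yet
            }
            // Seen before and not yet reported: emit it once.
            if (!seen.add(key) && reported.add(key)) {
                res.add(s.substring(i - WINDOW + 1, i + 1));
            }
        }
        return res;
    }
}
```

Here substring() runs only once per repeated sequence, and the sets store boxed ints instead of 10-character strings.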

    Java:

    HashMap:

    Time Complexity - O(n · L) with L = 10, i.e. O(n); Space Complexity - O(n).

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class Solution {
        public List<String> findRepeatedDnaSequences(String s) {
            List<String> res = new ArrayList<>();
            if (s == null) return res;
            // 1 = seen once, 2 = seen at least twice and already reported.
            Map<String, Integer> map = new HashMap<>();
            for (int i = 0; i + 10 <= s.length(); i++) {
                String str = s.substring(i, i + 10);
                if (!map.containsKey(str)) {
                    map.put(str, 1);
                } else {
                    if (map.get(str) == 1) res.add(str);
                    map.put(str, 2);
                }
            }
            return res;
        }
    }

    Reference:

    https://leetcode.com/discuss/24595/short-java-rolling-hash-solution

    https://leetcode.com/discuss/24557/just-7-lines-of-code

    https://leetcode.com/discuss/24478/i-did-it-in-10-lines-of-c

    https://leetcode.com/discuss/25399/clean-java-solution-hashmap-bits-manipulation

    https://leetcode.com/discuss/25536/am-understanding-the-problem-wrongly-what-about-aaaaccccca

    https://leetcode.com/discuss/29623/11ms-solution-with-unified-hash-fxn

    https://leetcode.com/discuss/54777/easy-to-understand-java-solution-with-well-commented-code

    https://leetcode.com/discuss/46948/accepted-java-easy-to-understand-solution

    https://leetcode.com/discuss/64841/7-lines-simple-java-o-n

  • Original post: https://www.cnblogs.com/yrbbest/p/4491657.html