Some Python String Manipulation Functions (Quick Reference)

Just putting some Python string manipulation techniques in one place as a quick reference.


a = "a quick brown fox run over the lazy dog"


reverse string
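The snippet for this entry is missing; a minimal sketch of one common way to do it, assuming the string `a` defined above (the name `reversed_a` is only illustrative):

```python
a = "a quick brown fox run over the lazy dog"

# A slice with step -1 walks the string from the end to the start.
reversed_a = a[::-1]
print(reversed_a)
```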



last character of the string
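The snippet is missing here as well; a likely form, assuming the same string `a` (the name `last_char` is illustrative):

```python
a = "a quick brown fox run over the lazy dog"

# Negative indexes count from the end, so -1 is the last character.
last_char = a[-1]
print(last_char)
```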



first 4 character of the string
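A minimal sketch with a slice, assuming the same string `a` (the name `first_four` is illustrative):

```python
a = "a quick brown fox run over the lazy dog"

# The slice stops before index 4, so it returns characters 0 to 3.
first_four = a[:4]
print(first_four)
```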



split a string with white spaces
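A minimal sketch, assuming the same string `a`; calling `split()` with no argument splits on any run of whitespace:

```python
a = "a quick brown fox run over the lazy dog"

# With no separator, split() breaks on whitespace and drops empty pieces.
words = a.split()
print(words)
```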



split a string with character or string (not regular expression)
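A sketch of splitting on a literal separator, assuming the same string `a`; the separators used here are only examples. The argument is matched as plain text, not as a pattern (`re.split` is the regular-expression variant).

```python
a = "a quick brown fox run over the lazy dog"

# Split on a single character.
by_char = a.split("o")

# Split on a longer string; the separator itself is removed.
by_string = a.split(" the ")

print(by_char)
print(by_string)
```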



find a string inside a string
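A minimal sketch, assuming the same string `a`. `find()` returns the index of the first match, or -1 when there is none; the `in` operator is enough when only a yes/no answer is needed.

```python
a = "a quick brown fox run over the lazy dog"

fox_index = a.find("fox")   # index of the first match
cat_index = a.find("cat")   # -1, because "cat" does not occur
has_fox = "fox" in a        # membership test without an index

print(fox_index, cat_index, has_fox)
```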





find from end of the string
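A minimal sketch, assuming the same string `a`. `rfind()` searches from the right and returns the index of the last match (still counted from the start of the string), or -1.

```python
a = "a quick brown fox run over the lazy dog"

first_o = a.find("o")    # first "o", in "brown"
last_o = a.rfind("o")    # last "o", in "dog"

print(first_o, last_o)
```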





count number of occurrence in a string
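A minimal sketch, assuming the same string `a`. `count()` returns the number of non-overlapping occurrences.

```python
a = "a quick brown fox run over the lazy dog"

o_count = a.count("o")      # single character
the_count = a.count("the")  # longer substring

print(o_count, the_count)
```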



join several string together with separators

st = ","

seq = ("cat", "tiger", "lion")

print(st.join(seq))  # cat,tiger,lion


swap the case: lowercase letters become uppercase and uppercase letters become lowercase

a.swapcase()


replace part of the string
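A minimal sketch, assuming the same string `a`; the replacement words are only examples. `replace()` returns a new string, and the optional third argument limits how many matches are replaced.

```python
a = "a quick brown fox run over the lazy dog"

# Replace every occurrence of "fox".
replaced = a.replace("fox", "cat")

# Replace only the first "o".
replaced_once = a.replace("o", "0", 1)

print(replaced)
print(replaced_once)
```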




left, right and center justify


a.rjust(10, ' ')
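A sketch of all three methods. The short string `word` and the `*` fill character are assumptions chosen so the padding is visible; when the requested width is shorter than the string (as with `a` and a width of 10), the string comes back unchanged.

```python
a = "a quick brown fox run over the lazy dog"
word = "fox"

left = word.ljust(10, "*")      # text first, padding on the right
right = word.rjust(10, "*")     # padding on the left, text last
centered = word.center(10, "*") # padding split between both sides

print(left)
print(right)
print(centered)

# Width 10 is shorter than len(a), so nothing is added.
unchanged = a.rjust(10, " ")
```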

Posted on May 4, 2013 in Programming, Python


Speedup Your Database Using Efficient Design

Here are some tips to concentrate on at database design time to improve performance. They do not cover database design as a whole, just some tricks to improve efficiency.

  • Don't be afraid of normalization, which means don't go out of your way to avoid joins. A join on well-indexed tables is often more efficient than working with denormalized data.
  • Try to avoid Boolean flags.
  • Use indexes cleverly (think about which fields will be used in searching and ordering).
  • Try to avoid indexes on string columns.
  • Don’t index everything.
  • Do not duplicate indexes.
  • Be careful of redundant columns in an index or across indexes.
  • Normalize first, and denormalize where appropriate.
  • A nullable column can take more room to store than a NOT NULL one.
  • Choose appropriate character sets and collations: UTF-16 uses at least 2 bytes per character, whether the text needs it or not, and latin1 is usually faster than UTF-8.
  • Triggers are expensive, so use them wisely.
  • Be able to change your schema without ruining functionality of your code.
  • As your data grows, indexing may change (cardinality and selectivity change). Structuring may want to change. Make your schema as modular as your code. Make your code able to scale. Plan and embrace change, and get developers to do the same.
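
To make the indexing tips concrete, here is a small sketch using Python's built-in `sqlite3` module. The `orders` table, its columns and the index name are hypothetical, and other database servers report their query plans differently; the point is only that an index on the column used for searching lets the engine avoid a full table scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

# Index the column that queries filter on (an integer, not a string).
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# Ask SQLite how it would run the query; the last column is the description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT total FROM orders WHERE customer_id = ?",
    (42,),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

If the index is picked up, its name appears in the printed plan.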

How to Increase SQL Performance (Query Optimization)

Database and SQL performance largely depends on how the database is designed, the system configuration, the most frequent types of database access, load balancing and so on. Although different databases behave differently on a specific query because of their architectural differences, most performance tips apply to all of them: they relate to SQL itself, not to a specific DB server. Here I list some SQL performance (query optimization) tips that I usually try to follow during SQL programming. You might not be able to meet every one of them all the time, but try to follow as many as possible.

 Try To Avoid:

  • Don't use DISTINCT together with a GROUP BY clause.
  • Don't ORDER BY a random number on tables with a lot of data.
  • Don't ORDER BY text/blob/binary columns.
  • Try to avoid wildcards at the start or at both ends of LIKE patterns, since they usually prevent index use. It's better to avoid LIKE queries where you can.
  • Try to avoid IN, NOT IN and <>. IN and NOT IN work like a chain of OR operations.
  • Separate text/blobs/binary from metadata; don’t put text/blobs/binary in results if you don’t need them.
  • Don't SELECT or UPDATE data you don't need; filter accurately.
  • Avoid correlated subqueries in SELECT and WHERE clause.

 Try to Do:

  • Always name the columns in SELECT queries instead of picking all columns with an asterisk (*), and select as few columns as possible. You may still do calculations in the SELECT; this can be faster than doing them in the application layer.
  • Try to ORDER BY indexed columns.
  • Try to put indexed columns first in the WHERE and ON clauses.
  • Use COUNT(*) instead of COUNT(column name) when you want the number of rows. With a column name, the database checks for NULL and skips every row where that column is NULL. If you actually want the count of non-NULL values in a specific column, COUNT(column name) is the one to use.
  • Use UNION operations instead of OR operations
  • Try to DELETE small amount of data at a time.
  • Always try to use join instead of multiple queries or loops.
  • Use groupwise maximum instead of subqueries.
  • Try to split a complex query and join smaller ones when necessary.
  • For better performance, batch INSERT and REPLACE statements, or use LOAD DATA instead of INSERT.
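
The COUNT(*) versus COUNT(column name) difference and the batching tip can be seen in a small sketch with Python's built-in `sqlite3` module. The `pets` table and its rows are made up for the illustration, and `executemany` stands in here for a batched INSERT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (name TEXT, owner TEXT)")

# One batched call instead of three separate INSERT statements.
conn.executemany(
    "INSERT INTO pets (name, owner) VALUES (?, ?)",
    [("cat", "ann"), ("tiger", None), ("lion", "bob")],
)

# COUNT(*) counts rows; COUNT(owner) skips rows where owner is NULL.
total_rows = conn.execute("SELECT COUNT(*) FROM pets").fetchone()[0]
rows_with_owner = conn.execute("SELECT COUNT(owner) FROM pets").fetchone()[0]

print(total_rows, rows_with_owner)
```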

Posted on May 13, 2011 in Performance Issue, SQL



Get All Links From text/html Data Using Regular Expression (Link Extractor)

Sometimes you just want to extract all URLs from HTML or other text. That is easy to do with a regular expression. You can find lots of regular expressions for extracting links from text, but I found this one ( \\(?\\b(https??://|www[.])[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|] ) very useful, and it returned all the links I wanted. There may be some exceptions, which you will only find out by using it, but I expect it to catch the links you need about 99% of the time.

Below is a Java method that extracts the links and puts them into a HashSet. Its single argument contains the text to search.

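import java.util.HashSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private HashSet<String> getAllLinks(String data) {
    HashSet<String> crawledUrlList = new HashSet<String>();

    Pattern pattern = Pattern.compile("\\(?\\b(https??://|www[.])[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]");
    Matcher matcher = pattern.matcher(data);

    // Collect every match; the HashSet drops duplicate links.
    while (matcher.find()) {
        String url = matcher.group();
        crawledUrlList.add(url);
    }

    return crawledUrlList;
}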
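
For comparison, the same pattern can be tried quickly in Python rather than Java. This is only a sketch: the function name `get_all_links` and the sample text are made up, and the pattern is the one quoted above with the Java string escaping removed.

```python
import re

# Same pattern as above, written as a Python raw string.
LINK_PATTERN = re.compile(
    r"\(?\b(https??://|www[.])[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]"
)

def get_all_links(data):
    """Return the set of distinct links found in data."""
    # group(0) is the whole match, not just the captured scheme prefix.
    return {match.group(0) for match in LINK_PATTERN.finditer(data)}

sample = "See http://example.com/a?b=1 and www.example.org, or https://example.com/x."
links = get_all_links(sample)
print(sorted(links))
```

Trailing punctuation such as the comma and the final full stop is left out of the matches, because the last character class of the pattern does not allow it.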



Posted on May 11, 2011 in Java, Programming



How to Create a Crawler Using Java

There are two ways to crawl web pages in Java.

The most primitive but original way is to open a socket on port 80 and then send a GET request to obtain the content. It works almost like telnet.

  • telnet <host> 80
  • GET / HTTP/1.0 (followed by two newlines)

This way you get the content of the home page. The sample method below shows the whole procedure in Java.

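import java.io.*;
import java.net.*;

public String urlCrawle(String url) {
    StringBuffer objBuffer = new StringBuffer("");
    try {
        URL objURL = new URL(url);
        String host = objURL.getHost();
        String path = objURL.getPath();
        if (path.length() == 0) {
            path = "/";
        }
        // Append the query string only when the URL has one.
        if (objURL.getQuery() != null) {
            path = path + "?" + objURL.getQuery();
        }
        // The blank line (second CRLF) ends the request. The Host header is an
        // addition to the plain GET line; most virtual-hosted servers need it.
        String outQuery = "GET " + path + " HTTP/1.0\r\nHost: " + host + "\r\n\r\n";

        Socket s = new Socket(InetAddress.getByName(host), 80);
        PrintWriter out = new PrintWriter(new OutputStreamWriter(s.getOutputStream()));
        out.print(outQuery);
        out.flush();

        BufferedReader instream = new BufferedReader(new InputStreamReader(s.getInputStream()));
        String line = instream.readLine();
        // Keep the response (status line and headers included) only on a 200 status.
        if (line != null && (line.contains("HTTP/1.0 200") || line.contains("HTTP/1.1 200"))) {
            while (line != null) {
                objBuffer.append(line).append("\n");
                line = instream.readLine();
            }
        }
        s.close();
    }
    catch (Exception ex) {}

    //return this.stripTagFromHtml(objBuffer.toString());
    return objBuffer.toString();
}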

The problem with this procedure is that you have to separate the hostname, path and query string and then work with each of them individually. That is a clumsy way to work in Java, which is a high-level language with much better tools. For educational purposes, though, it's the best way to learn the underlying working procedure of the system.
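
The splitting step can be looked at on its own without touching the network. The sketch below is in Python rather than Java, and the helper name `build_request` and the example URLs are hypothetical; it only builds the text that would be written to the socket, including a Host header that the plain telnet example does not send.

```python
from urllib.parse import urlsplit

def build_request(url):
    """Build a minimal HTTP/1.0 GET request for the given URL."""
    parts = urlsplit(url)
    # An empty path means the home page.
    path = parts.path or "/"
    # Add the query string only when the URL has one.
    if parts.query:
        path += "?" + parts.query
    # The empty line at the end tells the server the request is complete.
    return "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, parts.hostname)

home_request = build_request("http://example.com")
search_request = build_request("http://example.com/search?q=fox")
print(search_request)
```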

As Java has a very wide range of network programming libraries, you can use the URL class to do web crawling; it's the easiest and also an effective way. You will find an example method below.

import java.io.*;
import java.net.*;

public String urlCrawle(String url) {
    StringBuffer objBuffer = new StringBuffer();
    try {
        URL hp = new URL(url);
        URLConnection hpCon = hp.openConnection();

        // URLConnection handles the request and headers; only the body is read here.
        BufferedReader instream = new BufferedReader(new InputStreamReader(hpCon.getInputStream()));
        String line = instream.readLine();
        while (line != null) {
            objBuffer.append(line).append("\n");
            line = instream.readLine();
        }
        instream.close();
    }
    catch (Exception ex) {}

    //return this.stripTagFromHtml(objBuffer.toString());
    return objBuffer.toString();
}



Posted on May 10, 2011 in Java, Programming



Using the SQL Server State Management for ASP.NET Session Management

It is possible to use SQL Server for session state management, especially if session state is of critical importance to our application and we can’t afford to lose the session state for a user.

Follow the steps below to use SQL Server to manage state.

  • Open the Web.Config file in the Visual Studio.NET editor.
  • Locate the <sessionState> XML element.
  • Change the Mode attribute to SQLServer.
  • Make sure the Cookieless attribute is set to true.
  • Change the sqlConnectionString attribute so the Data Source expression refers to our server. Add valid User Id and Password values as well. We do not need to specify the name of a database, as the tables that manage state are located in Tempdb.




sqlConnectionString=”data source=(local);user





After setting these attributes, we need to create the ASPState database with some stored procedures that the .NET Framework will use to manage state.

Follow these steps to complete the installation:

  • Find the file named  InstallSqlState.sql located in our <systemdrive>\Winnt\Microsoft.NET\Framework\<Version> folder.
  • Load the InstallSqlState.sql file into SQL Query Analyzer and execute the statements. This creates the ASPState database and all of the appropriate stored procedures.

After we have done these things, try running our sample that creates a session variable. We can stop and start the IIS and Web Publishing Services, and once again, our state will remain. If we open SQL Server Enterprise Manager and navigate to the Tempdb database, we will find a table named ASPStateTempSessions. Open this table and we will find a record with our session ID, the time this session was created, and when this session will expire. We will also see several binary fields. These fields contain the data for the session. We won’t be able to look at this data, but then we don’t really need to because the .NET Framework takes care of all of this for us automatically.

Issues with Automatic SQL Server State Management

Although using SQL Server to store our session state relieves us of many difficult development issues, we’ll still need to consider some important limitations:

  • Limited to SQL Server: This technique can only use SQL Server, not any other database server. If we do not have a SQL Server installation available, we will be unable to use this solution.
  • Performance may suffer: Like any of the state management techniques, using SQL Server to manage our application's state can cause our performance to degrade a little. Because it takes a little time to make a connection and to read and write state information in the database, there's no avoiding a small amount of overhead.


Using the ASP.NET State Service for State Management in ASP.NET Across Several Servers

One of the problems mentioned earlier was session state management across a Web farm. When users come to a Web site where any of several servers can serve a particular user at any time, the session state does not automatically carry over from machine to machine. Another problem with session state is that it typically runs in the same process space as the IIS service. As a result, if our IIS service goes down, we also lose all session state.

If we wish to solve both of these problems, we need to move the session state management to an out-of-process component that is separate from the IIS service. When the .NET Framework is installed, a new service called ASP.NET State is installed on our server. This service manages the Session object in a separate process. This separate process can be located on the same machine as IIS, or on a separate machine.

If we choose to use a separate server to be the state management machine, all the servers in our Web farm use this machine to store and retrieve state for a user. No matter which machine serves the user, the state is maintained on a separate machine, one from which each of these Web servers can retrieve data.

We are probably thinking that this is going to be very difficult to set up and maintain. Nothing could be further from the truth! In fact, all it takes is to change one setting in each Web server’s Web.Config file.

Follow the steps below to enable an out-of-process session state manager.

  • Using the Services applet, start the ASP.NET State Service.
  • Open the application's Web.Config file in the Visual Studio.NET editor.
  • Locate the <sessionState> XML element.
  • Change the Mode attribute from InProc to StateServer.
  • Make sure the Cookieless attribute is set to true.




sqlConnectionString=”data source=;user





After setting this attribute, we can test our changes:

  • Run some code that creates a session variable.
  • Stop and re-start the IIS Admin and Web Publishing Services.
  • Go back to a page that retrieves the session variable that we previously set: we’ll find that the saved variable still exists.

If we will be using a separate machine for state management, make sure that the ASP.NET State service is running on this other machine. Next, set the stateConnectionString attribute in the <sessionState> XML element to the name or IP address of the machine that will manage the state.

TIP:  By default, the stateConnectionString attribute points to the current machine. If we want to manage state on a separate machine, we'll need to modify this value.

Issues With the ASP.NET State Service

There are some issues with using this ASP.NET State service:

  • Performance: The performance of retrieving state from an out-of-process service will be slower than from an in-process service. If we are retrieving data across the network to a state server, we also have the network traffic to contend with. This can slow our retrieval of state significantly.
  • Redundancy: If we use another machine to manage state, we will need to set up some redundancy for this machine in case it crashes. Of course this redundancy will not help us if the original machine dies, because all of the session data is stored in memory.
