Tuesday, July 10, 2012

Source Control Management

Source Control Management (SCM), also called a Version Control System (VCS), is the cornerstone of enterprise software development. Simply put, it gives developers an easy way to share, track and secure their source code, so teams (and individual coders) can produce high-quality, reliable code knowing that every change can be retrieved, monitored and undone if necessary. The alternative is to keep the source code in a folder somewhere on a network, but then think how hard it would be to do the following:

  • Keep a history of changes to files
  • Allow multiple developers to work on a file at once
  • View the differences between files
  • Allow for multiple different features to be developed simultaneously
  • Allow for whole code streams to be merged together easily
  • Create snapshots of the state of the source at a point in time (so you know what’s deployed)
  • Track who made what changes and when (useful for determining when a bug was introduced)
  • and lots, lots more…
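As a rough illustration of the "view the differences between files" item, here is a minimal sketch using Python's standard difflib module. The file name and contents are made up for the example; a real SCM tool does this for you against the stored history.

```python
import difflib


def show_diff(old_text, new_text, name="hello.py"):
    """Return a unified diff between two versions of a (hypothetical) file."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile=name + " (old)",
            tofile=name + " (new)",
        )
    )


# Two made-up versions of the same file.
old = "print('hello')\nprint('world')\n"
new = "print('hello')\nprint('everyone')\n"

diff_text = show_diff(old, new)
print(diff_text)
```

Lines starting with "-" were removed and lines starting with "+" were added, which is the same convention most SCM clients use when showing a change.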

So what exactly is an SCM tool?

It’s a tool that you download and install on a server (internal to a household or company), or a cloud-based service that you connect to, which acts as a secure database of all your code (called a repository). The administrator sets up users on the system and gives them a URL to connect to. The end users then simply download the client tool onto their PC and pull the code they want to work on onto their machine. When they have made a change they’re happy with, they push it back to the repository. Other users then get those changes when they update their local copy. The beauty is that the system helps manage clashing changes, such as when another user has edited the same file, either resolving them automatically or letting the user choose which differences to keep. You’ll find my personal repository on GitHub.
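To make the "resolve automatically or let the user choose" idea concrete, here is a simplified sketch of a three-way merge. It compares whole files against their common starting point, whereas real SCM tools work line by line; the function name and the dictionaries of file contents are purely illustrative assumptions.

```python
def merge_files(base, mine, theirs):
    """Whole-file three-way merge over dicts of {path: content}.

    Returns (merged, conflicts). This is a simplified sketch: a real SCM
    tool merges line by line rather than file by file.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(path), mine.get(path), theirs.get(path)
        if m == t:        # both sides agree (or neither touched the file)
            result = m
        elif m == b:      # only the other user changed it: take theirs
            result = t
        elif t == b:      # only I changed it: take mine
            result = m
        else:             # both changed it differently: the user must choose
            conflicts.append(path)
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


# Made-up example: both users edit a.txt, only the other user edits b.txt,
# and only I add c.txt.
base = {"a.txt": "1", "b.txt": "1"}
mine = {"a.txt": "2", "b.txt": "1", "c.txt": "new"}
theirs = {"a.txt": "3", "b.txt": "5"}

merged, conflicts = merge_files(base, mine, theirs)
print(merged)
print(conflicts)
```

Here b.txt and c.txt merge without any help, while a.txt is reported as a conflict because both users changed it in different ways.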

Each SCM system has its own guidelines on how to lay out your branches and tree structures, so it’s always a good idea to read the best practices for the system you use (e.g. Subversion Best Practices).

Tip: Try a few services for yourself on your own code to decide on which you prefer before bringing one option into work.

 

Terminology

Repository: the database containing the source code.
Trunk/Master: the main stream of code, usually defined as what is currently in production. When a new feature is to be added, the developer creates a branch from trunk/master and, when it is ready to be released, merges it back into trunk/master.
Branching: the process of copying the code into two separate areas so that new features can be worked on independently; if one is finished before the other, they can be tested and released separately.
Merging: the process where two branched codebases are merged together into a parent stream.
Forking: similar to branching, but usually refers to creating a branch that may never rejoin the master, as in forking the open source repository of Linux to make your own version.
Tagging: a way of recording the state of your repository so that you can always get back to a certain point in the code if needed.
Checkout/Clone: when a developer pulls the latest code from the repository to their machine.
Checkin/Commit/Push: when a developer wants the code they worked on to be saved back to a central repository.
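The terms above can be tied together with a toy, in-memory model. This is only a sketch to show how commits, branches, tags and checkouts relate; the ToyRepo class and its methods are invented for illustration and do not reflect how any real SCM stores its data.

```python
class ToyRepo:
    """In-memory stand-in for a repository (illustrative only)."""

    def __init__(self):
        self.commits = []                 # each commit: (parent_id, files, message)
        self.branches = {"master": None}  # branch name -> latest commit id
        self.tags = {}                    # tag name -> commit id

    def commit(self, branch, files, message):
        """Record a new snapshot of `files` on `branch` and return its id."""
        self.commits.append((self.branches[branch], dict(files), message))
        self.branches[branch] = len(self.commits) - 1
        return self.branches[branch]

    def branch(self, new_name, from_branch="master"):
        """Start a new branch at the current tip of `from_branch`."""
        self.branches[new_name] = self.branches[from_branch]

    def tag(self, name, branch="master"):
        """Give the current tip of `branch` a permanent name."""
        self.tags[name] = self.branches[branch]

    def checkout(self, ref):
        """Return a copy of the files at a branch tip or a tag."""
        commit_id = self.branches[ref] if ref in self.branches else self.tags[ref]
        return {} if commit_id is None else dict(self.commits[commit_id][1])


repo = ToyRepo()
repo.commit("master", {"app.py": "v1"}, "first release")
repo.tag("release-1.0")                  # remember what was deployed
repo.branch("feature-x")                 # work on a feature independently
repo.commit("feature-x", {"app.py": "v2"}, "work on feature x")

print(repo.checkout("master"))
print(repo.checkout("feature-x"))
print(repo.checkout("release-1.0"))
```

The master branch and the tag still point at the first snapshot, while the feature branch has moved on, which is the independence that branching is meant to give you.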

Types of Source Control Management

Centralized

This is the most common type of system used by large companies. It's basically a central server (with proxies for scalability) that holds all your code in a database, with a management system on top that allows your code to be edited, forked, branched, merged and tagged. Users download the client software onto their development machines and connect to the central server. All the code required is then downloaded onto the client's machine and the developer works on it as needed. Merging facilities provided by the SCM system allow several developers to work on the same codebase simultaneously.

Distributed

These are becoming more popular these days as they let developers work offline. Essentially, every developer has their own code repository locally and can link to remote repositories, pulling in other people's changes or requesting that their own be added to a main stream. Take, for example, a developer who wants to work on an open source project hosted in an online distributed repository. The developer would fork the main repository and make changes to that codebase. If the developer decided that their changes were ready for the main product, they would ask the main branch owner to merge their branch into the main stream.

Popular SCM Tools

There are dozens of free and paid-for solutions available, with the most popular these days being Subversion and Git. Here are a few:

Subversion (svn): This seems to be the most popular centralized SCM at the moment, with a lot of support and tools. I like it a lot. There are many cloud-based hosting services for it that offer a free single-user plan.
Git: This is my current SCM of choice and is a popular decentralized system. It’s used a lot in open source development due to its ease of forking and remerging. It has a steep learning curve if you’re coming from the likes of SVN, but is worth it.
GitHub: A hosted service that gives Git repositories a central home. I use it a lot and there are free plans available.
AccuRev: A graphical management system that allows features to be added and removed via a drag-and-drop UI. I have not used this but have heard only good reviews.
Mercurial (hg): A lot of big systems use this and it seems very popular. I’ve only ever pulled code from it to build locally.
Perforce: I used this years ago and was very happy with it. It provides many features and is very scalable.
Visual SourceSafe: Microsoft's SCM, now superseded by Team Foundation Server (part of Team System). A very good system with excellent reporting and plugins for SCRUM systems.
CVS: The old school that people are moving from. I have never used it directly.

Friday, March 9, 2012

How to recover a RAID 0 (Striped) array that isn't recognised as a RAID Volume

If your PC refused to boot and you got an error that said:
    RAID Volumes
    None Defined
or looked similar to the image below, then you're not alone. It seems to be a common problem on the web that I have never found a good answer to, so here's what I had to do to get my data back using a Windows 7 PC.


First off, this post assumes that neither of your hard drives has suffered any serious mechanical failure and that most of the data is still accessible (for example, only the master boot record of the drives got corrupted).

NOTE: when working with damaged drives it's really important to use the drive as little as possible as you don't want to degrade the data any further.

Here's a summary of what I did:

- Connect each drive to another PC's motherboard (or boot the same PC from a RAM-disk environment like WinPE)
- Image both Hard Disk Drives (HDDs)
- Destripe (Reconstruct) the RAID into a single image
- Use a data recovery tool to analyse and extract data.


Here's that in a bit more detail:

1. Connect the damaged drive to another PC's motherboard
In order to salvage the data from the damaged drives I needed to connect them to another PC. This proved a little trickier than I first imagined. I normally use a SATA/USB adapter to connect loose hard drives to my PC; however, for some reason I couldn't get the drive to mount (I even tried in Ubuntu, as it has a utility called dd for imaging HDDs). Eventually I attached the HDD to my PC's motherboard. The easiest way to do this is to take the SATA connector from your DVD drive and plug it into the HDD; this will save you having to open your BIOS to add a new drive. You will also need power, so find a spare power line and hook it up.

Once you're connected, boot to Windows. You won't see the drive in Explorer, but if you open the "Disk Management" utility in "Computer Management" you'll see it.


2. Get an image of the affected drives
For this I used a free tool called ReclaiMe. It's very easy to use and comes with a lot of instructions, which I won't go into here. Whatever tool you use, at the end you should have a .img file sitting on your hard drive that is a complete copy of the drive you want to recover.
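If you're curious what "imaging" amounts to, here is a minimal sketch: read the source from start to finish in fixed-size chunks and write every byte to a file. This is not how ReclaiMe works internally (I don't know its internals); the function name, chunk size and paths are assumptions for illustration. It also has no handling for unreadable sectors, which a proper imaging tool needs, so treat it as an explanation rather than a replacement.

```python
def image_drive(source_path, image_path, chunk_size=1024 * 1024):
    """Copy every byte of `source_path` into `image_path`, chunk by chunk.

    Returns the number of bytes copied. `source_path` can be any readable
    file; on Windows a raw disk is typically opened with a device path
    such as r"\\.\PhysicalDrive1" from an administrator prompt (that path
    is an example, not something taken from this post).
    """
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty read means we reached the end
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

Reading sequentially in large chunks keeps the number of passes over a damaged drive to exactly one, which fits the advice above about using the drive as little as possible.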

Repeat steps 1 and 2 for the second RAID drive.

And yes, you guessed it. You'll need a lot of free space on your PC to get these images.

3. Destripe (Reconstruct) the RAID into a single drive image.
Once you have both drives imaged, you must then merge them together to get a single image. This effectively makes the drive non-RAID, in the hope that basic file recovery tools will be able to analyse the image. Again I used the free tool ReclaiMe. There is an option to open image files and reconstruct them, which is nice. This takes hours... and hours, but it's worth it in the end if you get your data back.
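To show what destriping means, here is a rough sketch. RAID 0 spreads data across the drives in fixed-size blocks (stripes), so rebuilding a single image means taking one block from each drive in turn. The sketch assumes you already know the stripe size and the drive order, and that the data starts at the very beginning of each image; a tool like ReclaiMe has to work those things out for you. The function name and the 128 KiB default are guesses for illustration, not values from my setup.

```python
from contextlib import ExitStack


def destripe_raid0(image_paths, output_path, stripe_size=128 * 1024):
    """Interleave stripe-sized blocks from each member image, in order.

    `image_paths` must be listed in the original drive order. The default
    stripe size is only a placeholder; the real value depends on how the
    array was created.
    """
    with ExitStack() as stack:
        sources = [stack.enter_context(open(p, "rb")) for p in image_paths]
        out = stack.enter_context(open(output_path, "wb"))
        while True:
            # Read one stripe from every drive, then write them in order.
            blocks = [src.read(stripe_size) for src in sources]
            if not any(blocks):  # every image is exhausted
                break
            for block in blocks:
                out.write(block)
```

For example, with a stripe size of 4 bytes, a first image containing AAAACCCC and a second containing BBBBDDDD would be rebuilt as AAAABBBBCCCCDDDD. Getting the order or the stripe size wrong produces an image that looks like scrambled data, which is why the automated detection in a dedicated tool is worth the wait.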

4. Use a data recovery tool to analyse and extract data
Once the drive is imaged, you're at the stage of using a file recovery tool to get your data back. I use GetDataBack for NTFS. There is a free version of the tool that will do the analysis to see if you can retrieve the data, so I recommend that you do this first.
Once you have followed their instructions on using the tool, you should be able to rebuild the index (this can take 6-8 hours) and eventually copy the files found from the image to another backup drive.

I hope this post helps. Let me know if you come across any difficulties or a better way of doing it.


Wednesday, February 8, 2012

Those darn svchost.exe processes

So long have I just taken for granted that what resides in them is a mystery... no more.

In case you don't know, svchost.exe is an executable container used to launch services that are implemented as dynamic link libraries (.dll). This is all well and good until you get virus paranoia and want to know what is actually running inside that generic veneer.

Well, it's easy to find out.
1. Open a command prompt
2. Enter: tasklist /SVC
3. All the running processes are displayed, along with the services hosted inside each one
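If you want to pick out just the svchost.exe rows, the output can be parsed with a few lines of script. This is a sketch that assumes tasklist is asked for CSV output (tasklist /SVC /FO CSV) with the columns in the order image name, PID, services; the function names and the sample text are made up for illustration.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Map PID -> list of service names for every svchost.exe row."""
    rows = csv.reader(io.StringIO(text))
    next(rows, None)  # skip the header row
    services_by_pid = {}
    for row in rows:
        if len(row) < 3:
            continue  # ignore blank or malformed lines
        image, pid, services = row[0], row[1], row[2]
        if image.lower() == "svchost.exe":
            names = [s.strip() for s in services.split(",") if s.strip()]
            services_by_pid[int(pid)] = names
    return services_by_pid


def svchost_services():
    """Run tasklist and parse its output (only works on Windows)."""
    output = subprocess.run(
        ["tasklist", "/SVC", "/FO", "CSV"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_tasklist_csv(output)


# Made-up sample in the assumed CSV layout, so the parser can be tried anywhere.
sample = (
    '"Image Name","PID","Services"\n'
    '"System Idle Process","0","N/A"\n'
    '"svchost.exe","820","DcomLaunch,PlugPlay,Power"\n'
    '"svchost.exe","904","RpcEptMapper,RpcSs"\n'
)
print(parse_tasklist_csv(sample))
```

On a real machine you would call svchost_services() instead of feeding in the sample text.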


Easy when you know how, I suppose.

... you could also use Sysinternals Process Monitor, available here, but the tasklist approach means you don't need to install any extra tools.