Sunday, April 10, 2011

Taking control of the cloud

I found this white paper on cloud computing, covering the types of clouds and how to secure them. I thought you might be interested in reading it:

http://library.dzone.com/sites/all/files/whitepapers/Taking_Control_of_the_Cloud_0.pdf

Saturday, April 9, 2011

Regression vs. Retesting


You must retest fixes to ensure that issues have been resolved before development can progress.

So, retesting is the act of repeating a test to verify that a found defect has been correctly fixed.

Regression testing, on the other hand, is the act of repeating other tests in 'parallel' areas to ensure that the applied fix or code change has not introduced other errors or unexpected behavior.

For example, if an error is detected in a particular file handling routine then it might be corrected by a simple change of code. If that code, however, is utilised in a number of different places throughout the software, the effects of such a change could be difficult to anticipate. What appears to be a minor detail could affect a separate module of code elsewhere in the program. A bug fix could in fact be introducing bugs elsewhere.

You would be surprised to learn how common this actually is. In empirical studies it has been estimated that up to 50% of bug fixes actually introduce additional errors in the code. Given this, it's a wonder that any software project makes its delivery on time.

Better QA processes will reduce this ratio but will never eliminate it. Programmers risk introducing careless errors every time they place their hands on the keyboard. An inadvertent slip of a key that replaces a full stop with a comma might not be detected for weeks but could have serious repercussions.

Regression testing attempts to mitigate this problem by assessing the ‘area of impact’ affected by a change or a bug fix to see if it has unintended consequences. It verifies known good behavior after a change.
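
To make the difference concrete, here is a minimal sketch in Python. The helper normalize_filename and its two callers are hypothetical, invented only for this illustration: assume a defect was reported where a file name with a trailing space was not recognised as a log file, and the fix was to strip whitespace in the shared helper. The retest repeats the test that originally failed; the regression checks repeat tests of behavior that was already known to be good.

```python
def normalize_filename(name: str) -> str:
    """Shared helper (hypothetical). The fix under test: surrounding
    whitespace is now stripped, which was the reported defect."""
    return name.strip().lower()


def is_log_file(name: str) -> bool:
    """First caller: the place where the defect was originally reported."""
    return normalize_filename(name).endswith(".log")


def backup_name(name: str) -> str:
    """Second caller in a 'parallel' area that shares the same helper."""
    return normalize_filename(name) + ".bak"


def retest_reported_defect() -> bool:
    """Retest: repeat the exact test that originally failed."""
    return is_log_file("server.LOG ") is True


def regression_checks() -> bool:
    """Regression: repeat tests of known good behavior elsewhere."""
    return (
        is_log_file("notes.txt") is False
        and backup_name("Data.csv") == "data.csv.bak"
    )


if __name__ == "__main__":
    print("retest passed:", retest_reported_defect())
    print("regression passed:", regression_checks())
```

In a real project these checks would normally live in a test framework; plain functions are used here only to keep the sketch self-contained.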

Hadoop


What is Hadoop?

Open source software for reliable, scalable, and distributed computing.

Flexible infrastructure for large scale computation and data processing on a network of commodity hardware.

The Linux of distributed processing.

Why Hadoop?

Very large distributed file system.

The data is distributed across data nodes.

Reliability and availability.

Files are replicated to handle hardware failure.

Detects failures and recovers from them.

Ability to run on cheap hardware.

Open source flexibility.

Runs on heterogeneous OS.

Scalability.

The number of nodes in a cluster is not constant.

Parallel processing through MapReduce.

Main components:

HDFS (Hadoop Distributed File System) for storage.

The Map-Reduce programming model for processing.

Hadoop Distributed File System (HDFS)

A distributed file system modeled on the Google File System (GFS), which Hadoop uses as its shared file system.

Distributed across data servers.

Data files are partitioned into large chunks (64 MB) and replicated on multiple data nodes.

NameNode stores metadata information (block locations, directory structure).
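
The chunking and replication described above can be sketched in a few lines of Python. This is a toy model, not HDFS code: the replication factor of 3, the node names, and the round-robin placement are assumptions made for the illustration, and real HDFS placement also takes rack layout into account. The dictionary returned by place_replicas plays the role of the block-location metadata that the NameNode keeps.

```python
from typing import Dict, List

BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size quoted above
REPLICATION = 3                # assumed replication factor for this sketch


def split_into_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return the size in bytes of each block for a file of file_size bytes."""
    if file_size <= 0:
        return []
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(
    num_blocks: int, data_nodes: List[str], replication: int = REPLICATION
) -> Dict[int, List[str]]:
    """Toy round-robin placement: block index -> data nodes holding a copy."""
    if replication > len(data_nodes):
        raise ValueError("not enough data nodes for the requested replication")
    return {
        block: [data_nodes[(block + r) % len(data_nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = split_into_blocks(200 * 1024 * 1024)  # a 200 MB file
    print(len(blocks), "blocks, last one holds", blocks[-1], "bytes")
    print(place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"]))
```

A 200 MB file ends up as three full 64 MB blocks plus one 8 MB block, and losing any single node still leaves two copies of every block.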

Map-Reduce:

Framework for distributed processing of large data sets.
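
The programming model itself is easy to show with the classic word count example. The sketch below runs in a single Python process and does not use Hadoop's API; the function names map_phase, shuffle and reduce_phase are made up for this illustration. In a real cluster the map and reduce steps run in parallel on many nodes, and the framework does the grouping in between.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the cat", "The dog"]))
```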

Amazon DB

Amazon SimpleDB
Amazon SimpleDB is a highly available, scalable, and flexible non-relational data store that offloads the work of database administration. Developers simply store and query data items via web services requests, and Amazon SimpleDB does the rest.
Behind the scenes, Amazon SimpleDB creates and manages multiple geographically distributed replicas of your data automatically to enable high availability and data durability. The service responds to changes in traffic by charging you only for the compute and storage resources actually consumed in serving your requests. You can change your data model on the fly, and data is automatically indexed for you. With Amazon SimpleDB, you can focus on application development without worrying about infrastructure provisioning, high availability, software maintenance, schema and index management, or performance tuning.
Amazon SimpleDB Functionality:
Build your data set
Choose a Region for your Domain(s) to optimize for latency, minimize costs, or address regulatory requirements. Amazon SimpleDB is currently available in the US-East (Northern Virginia), US-West (Northern California), and EU (Ireland) Regions.
Use CreateDomain, DeleteDomain, ListDomains, and DomainMetadata to create and manage query domains
Use PutAttributes, BatchPutAttributes, and DeleteAttributes to create and manage the data set within each query domain
Retrieve your data
Use GetAttributes to retrieve a specific item
Use Select to query your data set for items that meet specified criteria
Pay only for the resources that you consume
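
To get a feel for the data model behind these operations, here is a small in-memory stand-in written in Python. It is only a toy: it does not call AWS, the class ToyDomain and its method names are invented for this sketch (they loosely mirror PutAttributes, GetAttributes, DeleteAttributes and a single-condition Select), and it assumes the usual SimpleDB shape of a domain holding named items whose attributes are string-valued and can carry several values.

```python
from typing import Dict, List, Set


class ToyDomain:
    """In-memory stand-in for one query domain (not the AWS API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # item name -> attribute name -> set of string values
        self.items: Dict[str, Dict[str, Set[str]]] = {}

    def put_attributes(self, item_name: str, attributes: Dict[str, str]) -> None:
        """Add values to an item, creating the item if needed (no schema)."""
        item = self.items.setdefault(item_name, {})
        for attr, value in attributes.items():
            item.setdefault(attr, set()).add(value)

    def get_attributes(self, item_name: str) -> Dict[str, Set[str]]:
        """Retrieve one specific item by name."""
        return self.items.get(item_name, {})

    def delete_attributes(self, item_name: str) -> None:
        """Remove an item and all of its attributes."""
        self.items.pop(item_name, None)

    def select_where(self, attr: str, value: str) -> List[str]:
        """Return the names of items having the given attribute value."""
        return sorted(
            name
            for name, attrs in self.items.items()
            if value in attrs.get(attr, set())
        )


if __name__ == "__main__":
    books = ToyDomain("books")
    books.put_attributes("item1", {"author": "Smith", "tag": "cloud"})
    books.put_attributes("item1", {"tag": "testing"})  # second value, same attribute
    books.put_attributes("item2", {"author": "Jones", "tag": "cloud"})
    print(books.select_where("tag", "cloud"))
    print(books.get_attributes("item1"))
```

Note that item1 and item2 never had to agree on a fixed set of columns, which is what "change your data model on the fly" amounts to in practice.
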
Amazon SimpleDB Pricing:
As your demand grows, you still pay only for what you use. As with other AWS services, there is no minimum fee and no long-term commitment. Amazon charges less where its costs are lower, so some prices vary across Geographic Regions. The prices listed are based on the Region in which you establish your Amazon SimpleDB domain(s). Amazon SimpleDB may be used from most countries, so long as payment is made in US Dollars.
Amazon SimpleDB measures the machine utilization of each request and charges based on the amount of machine capacity used to complete the particular request (SELECT, GET, PUT, etc.), normalized to the hourly capacity of a circa 2007 1.7 GHz Xeon processor.
Machine Utilization.
Data Transfer.
Structured Data Storage.
Amazon Relational Database Service (RDS):
Amazon Relational Database Service (Amazon RDS) is a web service that makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you up to focus on your applications and business.
Amazon RDS gives you access to the full capabilities of a familiar MySQL database. This means the code, applications, and tools you already use today with your existing MySQL databases work seamlessly with Amazon RDS. Amazon RDS automatically patches the database software and backs up your database, storing the backups for a user-defined retention period.
You also benefit from the flexibility of being able to scale the compute resources or storage capacity associated with your relational database instance via a single API call.
As with all Amazon Web Services, there are no up-front investments required, and you pay only for the resources you use.
RDS Functionality:
Launch a database instance (DB Instance), selecting the DB Instance class and storage capacity that best meets your needs.
Select the desired retention period (in number of days) for your automated database backups. Amazon RDS will automatically back up your database during your predefined backup window. For typical workloads, this allows you to restore to any point in time within your retention period, up to the last five minutes. You can also restore from a DB Snapshot, a user-initiated backup that can be run at any time with a simple API call.
Connect to your DB Instance using your favorite database tool or programming language. Since you have direct access to a full-featured MySQL database, any tool designed for the MySQL engine will work unmodified with Amazon RDS.
Monitor the compute and storage resource utilization of your DB Instance, for no additional charge, via Amazon CloudWatch. If at any point you need additional capacity, you can scale the compute and storage resources associated with your DB Instance with a simple API call.
Pay only for the resources you actually consume, based on your DB Instance hours consumed, database storage, backup storage, and data transfer.
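
The point-in-time restore rule above boils down to simple date arithmetic, sketched below in Python. This is a simplification and not an RDS API: the function names are made up, the five-minute lag is taken from the text above, and the sketch assumes automated backups have been running for the whole retention period.

```python
from datetime import datetime, timedelta
from typing import Tuple


def restore_window(
    now: datetime, retention_days: int, lag: timedelta = timedelta(minutes=5)
) -> Tuple[datetime, datetime]:
    """Return the (earliest, latest) restorable instants under the stated rule."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    earliest = now - timedelta(days=retention_days)
    latest = now - lag
    return earliest, latest


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    """True if target falls inside the restorable window."""
    earliest, latest = restore_window(now, retention_days)
    return earliest <= target <= latest


if __name__ == "__main__":
    now = datetime(2011, 4, 9, 12, 0, 0)
    print(restore_window(now, retention_days=7))
    print(can_restore_to(datetime(2011, 4, 5, 8, 30), now, retention_days=7))
```
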
RDS Pricing:
Pay only for what you use. There is no minimum fee. Estimate your monthly bill using the AWS Simple Monthly Calculator.
Amazon RDS DB Instance Pricing
Provisioned Database Storage
Backup Storage
Data Transfer
Availability Zone Data Transfer