PureStorage

Soldato
Joined
25 Nov 2004
Posts
3,792
http://www.purestorage.com/

Anyone seen or heard of these guys? I just came back from a sales pitch/technology introduction and WOW. Basically, it's a pure flash SAN. With that come all the benefits you would expect in terms of I/O and IOPS, as well as all the drawbacks you would expect, like cost.

They have done some awesome things to reduce the TCO, such as in-line de-dupe and compression before the data hits the SSDs. That lowers write volume and extends the life of the flash media, and it obviously gives you more bang for your buck, as 6TB of raw storage becomes 12-24TB+ of effective capacity depending on what you store on it.
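To put rough numbers on that, here is a minimal Python sketch of how inline de-dupe plus compression multiplies effective capacity. The 4KB block size, SHA-256 fingerprinting and zlib compression are assumptions of mine for illustration; Pure hasn't published how Purity does it internally:

    import hashlib
    import zlib

    BLOCK = 4096            # assumed block size, for illustration only
    seen = set()            # fingerprints of blocks already stored
    logical = physical = 0  # bytes the host wrote vs bytes hitting flash

    def inline_write(block):
        global logical, physical
        logical += len(block)
        fp = hashlib.sha256(block).digest()    # fingerprint the block
        if fp in seen:
            return                             # duplicate: no flash write at all
        seen.add(fp)
        physical += len(zlib.compress(block))  # unique data is compressed first

    # a workload with heavy duplication, e.g. cloned VMs or database pages
    for i in range(10_000):
        inline_write(bytes([i % 50]) * BLOCK)  # only 50 unique blocks

    print(f"host wrote {logical / 1e6:.1f} MB, flash stored {physical / 1e6:.3f} MB")
    print(f"reduction ratio {logical / physical:.0f}:1")

Fewer physical writes is also exactly why the technique stretches MLC endurance: the duplicates never touch the flash at all.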

Anyway this was the first time I had heard of them and I am very impressed. Needless to say we are investing in an array and I am keen to get my grubby little mitts on it :D
 
Soldato
OP
Joined
25 Nov 2004
Posts
3,792
I don't have sales figures as I was purely there as a technical resource. I had not heard of Nimble until now, but a quick Google reveals they spout much of the same stuff as PureStorage. I cannot tell you in depth what the differences are, if any, as Pure is the only company I have attended a sit-down with.

Both seem to claim they were first with everything; who is telling the truth, I don't know.

As for the MLC concern, the way Pure described it is that they are selling their proprietary software (known as Purity, which sits on top of a hardened Ubuntu installation), and that sits on top of consumer-grade hardware. Their controllers are Dell servers, their disks are Samsung MLC, and there are two SLC drives per disk chassis, which were branded Zeus when I had a play and unplugged them from the array. These are NVRAM modules responsible for ensuring data is written and verified to the array before being cleared. Because of this, it doesn't matter if a drive fails: when it is replaced, the data is simply rebuilt from the others, but instead of taking a day to rebuild a large array, it takes about 10 minutes. They challenged us to break their array; we unplugged six drives, including two of the NVRAM drives, pulled the power on one of the controllers, and tried various other things. The array just kept running. I was impressed, but then I haven't been around loads of SANs, only really an EVA and a P4800.
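The NVRAM behaviour, as I understood it, boils down to a write-journal pattern. Here is a minimal Python sketch of that pattern; the class names and flow are my illustration of the general idea, not Pure's actual code:

    # A write is acknowledged once it lands in NVRAM; the journal entry is
    # only cleared after the data has been destaged to flash and verified.
    class Flash:
        def __init__(self):
            self.cells = {}
        def destage(self, addr, data):
            self.cells[addr] = data
        def read(self, addr):
            return self.cells.get(addr)

    class NvramJournal:
        def __init__(self):
            self.pending = {}                          # addr -> data awaiting destage

        def write(self, addr, data, flash):
            self.pending[addr] = data                  # 1. land the write in NVRAM
            print(f"write to {addr:#x} acknowledged")  # 2. ack the host
            flash.destage(addr, data)                  # 3. destage (async in a real array)
            if flash.read(addr) == data:               # 4. verify before clearing NVRAM
                del self.pending[addr]

    journal, flash = NvramJournal(), Flash()
    journal.write(0x1000, b"hello", flash)

If a controller dies between steps 2 and 4, the survivor can replay whatever is left in the journal, which is presumably why pulling drives and a controller mid-test lost nothing.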

Oh, and in the 36 months the company has been shipping units they have had 5 drive failures in total. That's another figure I was impressed with, not least because we have had 5+ drive failures on our P4800s in a single year.

They also have a "let us put a system in your data centre and if it doesn't perform to your liking, send it back" policy. So far no one has sent one back.
 
Last edited:
Caporegime
Joined
18 Oct 2002
Posts
26,053
I think the days of the storage vendor are limited anyway, so people are just trying to cash out while they can. The era of buying a compute node and a separate storage node for your virtual environment is as good as over.
 
Soldato
OP
Joined
25 Nov 2004
Posts
3,792
people are just trying to cash out while they can.

"Cashing out" in the traditional sense doesn't make sense for Pure as they are a brand new company. They have no previous IP to protect or make money from. They started from the ground up with a pure flash "no disk" ethos.

The only exception I can see is if, in the future, they go down the road of selling their Purity software on its own and letting you put it on any hardware you like. I guess that avenue could gain popularity, but it would also open them up to a world of pain in terms of hardware support.
 

Ev0

Soldato
Joined
18 Oct 2002
Posts
14,152
They have no previous IP to protect or make money from.

They've had to buy it all (there was a recent article about them buying a load of patents off companies), and are also currently in a legal battle with EMC over some IP infringements ;)
 
Last edited:
Soldato
OP
Joined
25 Nov 2004
Posts
3,792
Well, for anyone who cares, I got some pricing on the kit. The Pure 11TB offering came in at £178K. A similar IBM offering was slightly more at £185K, and HP/3PAR was £330K :o

I have contacted Nimble to discuss what they can offer. The more vendors the better!
 
Soldato
OP
Joined
25 Nov 2004
Posts
3,792
Why do you need pure flash? Unless you have crazy IO requirements, it's just willy waving.

We have an archaic database application that we "have to keep alive no matter what", and the only thing we can do is throw tin at it. We put together a proposal to develop a new platform, but it was denied. The funny thing is that the tin will probably cost as much as developing a new system, but what the board decides is out of our control.
 
Associate
Joined
4 Dec 2002
Posts
316
Location
Chelmsford
I think the days of the storage vendor are limited anyway, so people are just trying to cash out while they can. The era of buying a compute node and a separate storage node for your virtual environment is as good as over.

As I am being dragged into the virtual environment (coming from a networking background), what is the alternative to separate compute and storage nodes?
 
Associate
Joined
4 Aug 2008
Posts
1,778
Location
Waterlooville
Somewhat concerned about how this works...

So you perform inline dedupe, then when your DB calls up the data which has been deduped, surely there is huge IO latency while you wait for the storage processor to look up the data required? In my mind the dedupe is where it falls down. Obviously the same applies to compression.

Somewhat surprised that a device as expensive as this does not do SMART detection and fail the drive ahead of time, so you do suffer rebuild times, even at 10 minutes... which will heavily impact any sizeable disk array.

Would really recommend a tiered environment over this... since it doesn't sound like you are truly IO intensive...
 
Soldato
OP
Joined
25 Nov 2004
Posts
3,792
You are thinking in terms of how disk works, not how flash works. I don't have all of the answers, but I saw it in action, I saw performance graphs over time under load, and there is no IO hit. Don't you think that if there was, it would be a major flaw in their business model, and anyone using the array would see right through it? More likely the array is capable of, say, 300K IOPS, they lose 50K IOPS to the inline dedupe/compression, and so they quote 250K IOPS. That is speculation, but it is how I see it working.

As for proactive monitoring, that is exactly what they do. The array phones home to headquarters, where they monitor all sorts of metrics, tune your array based on the performance data to get the most out of it, and detect failures ahead of time.
 

Deleted member 138126

Somewhat concerned about how this works...

So you perform inline dedupe, then when your DB calls up the data which has been deduped, surely there is huge IO latency while you wait for the storage processor to look up the data required? In my mind the dedupe is where it falls down. Obviously the same applies to compression.

Somewhat surprised that a device as expensive as this does not do SMART detection and fail the drive ahead of time, so you do suffer rebuild times, even at 10 minutes... which will heavily impact any sizeable disk array.

Would really recommend a tiered environment over this... since it doesn't sound like you are truly IO intensive...
Couple of points:

- Dedupe has zero impact on read performance. A deduped block simply becomes a pointer to the master block (the original block that was written), and this pointer is held in the master lookup table (which will be 100% in RAM), so there is zero performance hit on the read; see the sketch after these points. Write performance may be impacted depending on how they do the dedupe, but I'm guessing the write itself is cached (in RAM), so from the point of view of the writer, there should be no impact.

- My experience with SSDs is that they fail catastrophically, they don't seem to degrade over time. And pre-failing a disk still requires a rebuild, so how is it any better than just waiting for the disk to fail, and then rebuilding at that point? Mind you, pre-failing may be better from a pure housekeeping point of view, but the end result is the same: a rebuild which may or may not have a performance impact.

- Don't outright disagree with the tiered solution point. However, tiered systems introduce an additional level of unpredictability, e.g. the DBAs complain about performance on one of their SQL Servers, and by the time you've spent a few days collecting performance logs, the problem magically goes away (because the data got promoted to a faster tier). The same can happen in reverse, where something that had never been a problem starts having performance issues because the data got tiered down. So I'm not 100% convinced that it's a great solution. Yes, you get theoretically much better utilisation of expensive storage, while being able to effectively use a lot more cheap storage. But this comes at the cost of predictability, which is no small thing in Tier 0/Tier 1 systems.
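To illustrate the first point, here is a minimal sketch of a deduped read path, assuming (as above) that the lookup table is held entirely in RAM; the structures are invented for illustration:

    lookup_table = {}    # logical address -> master block id (held in RAM)
    master_blocks = {}   # master block id -> data (on flash)

    def write(addr, block_id, data):
        master_blocks.setdefault(block_id, data)   # data stored once
        lookup_table[addr] = block_id              # duplicates are just pointers

    def read(addr):
        # one in-RAM lookup plus one media read: the same cost whether or
        # not the block was deduped
        return master_blocks[lookup_table[addr]]

    write(0, "blkA", b"payload")    # original write
    write(1, "blkA", b"payload")    # duplicate: only a new pointer
    assert read(0) == read(1) == b"payload"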
 
Associate
Joined
4 Aug 2008
Posts
1,778
Location
Waterlooville
Ahh ok, that makes more sense.

The 50K IOPS hit is the overhead I was talking about; the way they choose to describe it to you is not overly important. De-dupe adds a performance impact, so that's interesting.

Obviously a flash array isn't going to perform worse than the same number of mechanical spindles. As for the business model, you should have been in a recent meeting I had with Dell...

If the array does SMART detection and early rebuild to a hot spare, that's good; reading your OP, it didn't seem like that was the case.

Will be interesting to see how this pans out. Lots of new products are hitting the market, and it can always be a bit of a stab in the dark when moving to a new vendor.
 
Last edited:
Associate
Joined
4 Aug 2008
Posts
1,778
Location
Waterlooville
Couple of points:

Thanks for your input.

Noted on dedupe, that is worth knowing, but it still obviously has overhead.

Pre-failure is significantly better, though obviously only in scenarios where it applies. For instance, if you are writing more data per day than the SSD is certified for, the unit will see this and adjust its expected lifetime. As the drive nears end of life, reads are still served from the failing disk while new data is written to the hot spare; once all the current data is on the new drive and the old disk holds no valid data, it can be marked as failed, which allows for effectively no rebuild time. Of course, catastrophic failure is still catastrophic.
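Roughly, that evacuation could look like this sketch (my illustration of the idea, not any vendor's code): new and rewritten blocks land on the hot spare while reads still come from the wearing drive, until the old drive holds no valid data and can be failed without a rebuild.

    class Drive:
        def __init__(self, name):
            self.name, self.blocks = name, {}

    old, spare = Drive("wearing-out"), Drive("hot-spare")
    location = {}                        # block id -> drive currently holding it

    def write(block_id, data, evacuating):
        target = spare if evacuating else old
        target.blocks[block_id] = data
        prev = location.get(block_id)
        if prev and prev is not target:
            del prev.blocks[block_id]    # the block migrates off the old drive
        location[block_id] = target

    def read(block_id):
        return location[block_id].blocks[block_id]

    write("a", b"1", evacuating=False)
    write("b", b"2", evacuating=False)
    write("a", b"1v2", evacuating=True)  # rewrite lands on the spare
    write("b", b"2v2", evacuating=True)
    assert read("a") == b"1v2" and not old.blocks  # old drive empty: no rebuild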

As for tiering, since we don't know the environment, as you say there are a lot of considerations.
 

Deleted member 138126

Noted on dedupe, that is worth knowing, but it still obviously has overhead.
If the write is cached, I can't see how there'd be any performance impact. From the point of view of the writer, the block has been written, and it can move on. In the background, the array controller calculates the checksum of the new block, and then checks the lookup table to see if an identical block already exists. If not, it commits the new block to disk; if it does, it updates the lookup table with a pointer to the existing block and doesn't commit the new block to disk.
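A sketch of that write path in Python (the structures and the SHA-256 checksum are my assumptions, not anything the vendor has confirmed):

    import hashlib
    from collections import deque

    cache = deque()     # acknowledged writes awaiting background processing
    fingerprints = {}   # checksum -> id of a block already committed
    on_disk = {}        # block id -> data actually committed to disk
    pointers = {}       # logical address -> block id

    def host_write(addr, data):
        cache.append((addr, data))   # cached: the writer moves on immediately

    def background_dedupe():
        while cache:
            addr, data = cache.popleft()
            csum = hashlib.sha256(data).digest()
            block_id = fingerprints.get(csum)
            if block_id is None:          # new block: commit it to disk
                block_id = len(on_disk)
                on_disk[block_id] = data
                fingerprints[csum] = block_id
            pointers[addr] = block_id     # either way, the address is a pointer

    host_write(0, b"same payload")
    host_write(1, b"same payload")
    background_dedupe()
    assert len(on_disk) == 1 and pointers[0] == pointers[1]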
 
Soldato
Joined
18 Oct 2002
Posts
9,505
BUMP

Been working for Pure for a while now but only just thought to search this forum.

Let me know if I can answer any questions :)
 