[OTDev] OpenAM performance

Vedrin Jeliazkov vedrin.jeliazkov at gmail.com
Tue Jul 5 13:29:44 CEST 2011


Hi Surajit,

Thanks a lot for your comments. I've spent some time sharing more
thoughts on them (see below).

On 5 July 2011 07:03, surajit ray <mr.surajit.ray at gmail.com> wrote:
> Hi,
>
> Just my $0.02 ....
>
> I have brought up in the past the issue of tokens evaporating before a task
> is completed and data uploaded. I think it's a serious question which as yet
> has no answer within the A&A Framework.

A token maps to a session with associated data, maintained by the
AA(A) server(s) (any AAA server, not necessarily OpenAM). If you have
millions of users with long-lasting sessions (days or weeks), you'll
definitely have to run an Amazon-like datacenter just to keep these
millions of sessions alive.

> Also Nina in the reply has said that
> this [OpenAM] is a temporary solution rather than a completely thought out
> one.

Well, things can always be improved. However, I wouldn't say that the
solution wasn't a thought-out one. One of the factors that played an
important role when selecting this particular solution was that we
needed to have something designed, implemented, installed, tested and
running in a matter of months, and within the competence/budget
constraints of OpenTox. We succeeded in this (at least to some extent),
but we could perhaps do much better, given the right conditions :-)

> I feel investing in technology [hardware wise] which is yet to serve
> our purposes completely is not necessary and we could rather do with a
> solution which is less costly and hence less of a loss in case we decide to
> change the A&A framework later.

In many cases the less costly solutions in fact prove to have a higher
price tag in the long run. Moreover, when you pay for a cloud service,
you also pay for the underlying hardware indirectly, don't you? :-)
You also pay for all the other stuff (installation, maintenance,
hosting, cooling, electricity supply, network access, etc). In
addition, you pay for basic administration of this hardware (up to the
level of the virtual service you're using). If at the end this proves
to cost less than running your own hardware, this could be due in part
to the fact that you're sharing this hardware and services with many
others, which helps reduce the price tag, but also makes you
dependent on them (especially performance-wise) to some extent.

> As to OpenAM itself, it is no longer supported by Sun and is now maintained by
> an independent Norwegian firm, which as we have now seen, has managed to mess
> up the latest release.

I'm not sure what you mean here. The latest release has many
improvements in a lot of aspects. The upgrade-related bug is not a show
stopper, even for OpenTox. It only means that you have to start a new
setup from scratch and re-import (or re-create) all the existing
policies in it (as I did in our test setup).
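
For illustration only -- a rough sketch of what re-creating saved
policies against a fresh setup could look like, assuming the policies
were previously exported as XML files and that the AA server exposes a
REST policy endpoint (shown here as "/pol", with the admin token passed
in a "subjectid" header); both the URL and the header are assumptions
to be adjusted to the actual deployment:

import glob
import requests

AA_SERVER = "https://aa.example.org"   # hypothetical
ADMIN_TOKEN = "AQIC5w..."              # token of an account allowed to create policies

for path in glob.glob("exported-policies/*.xml"):
    with open(path, "rb") as f:
        r = requests.post(AA_SERVER + "/pol",
                          data=f.read(),
                          headers={"subjectid": ADMIN_TOKEN,
                                   "Content-Type": "application/xml"})
    print(path, r.status_code)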

> Businesses with confidential data are not gonna be
> inspired by this technology let alone trust it with their IP. Moreover our
> complex policy systems, although perfectly reasonable to us, are a mystery
> area to a new user. And something which is not easily understood will not be
> easily trusted.

Well, this applies to an even greater extent to any new AAA system you
might come up with, bearing in mind that distributed RESTful security
is currently an open research topic :-)

> Also I am amused by the anti-cloud sentiment here. There are vast [and
> secure] systems running in the cloud at this moment [on Google, Rackspace
> as well as Amazon and many others].

There are also vast security breaches in these systems that
occasionally pop up in the news. Of course, an in-house system could
also have security issues, but if it is managed properly, all other
things being equal, there are fewer opportunities for such breaches
than in a system that is not completely under your control. This is
simply a matter of trust, which is an important component of security.
If you trust the Amazon guys more than your own ability to build and
maintain a secure system, then flying in the clouds could be a nice
experience for you. However, this could also have something in
common with religious beliefs and the associated spiritual comfort...

BTW, many Amazon customers recently experienced a rather long-lasting
service outage, which was covered to some extent in the news. In an
attempt to make up for the lack of proper problem reporting to
customers, Amazon subsequently revealed quite a lot of detail on the
reasons for this outage. Interestingly, these details highlight quite a
number of flaws in their framework, which led to this outage. If you're
prepared to accept more than 72 hours of service outage and partial or
complete data loss (as experienced by a lot of Amazon customers), just
because you can then blame somebody else for it rather than your own
capability to deliver the service in the first place, then again, a
cloud service could be a reasonable choice.

Such situations are not specific to Amazon -- all the other cloud
service providers are also affected by similar issues quite often.

Last but not least, if you look at cloud services from a historical
perspective, they're not something new, but rather a re-invented wheel
that has been re-branded. During the last few decades this service
delivery model has had several ups and downs, and the fact that it is
so popular today doesn't tell you anything about what will be popular
in... say 2020. My personal guess is that what comes next is running
full-fledged servers on mobile devices (they already surpass the
computing power of supercomputers from the 90s). Perhaps the only issue
that remains to be solved better is power supply/consumption. If my
guess proves right, then the next centralised-distributed yo-yo swing
will either make the cloud useless or reduce its role to simple bulk
data storage.

> The cloud computer is just a machine
> and although it is virtual the same security mechanisms apply.

The paramount difference here is that the virtual machine adds an
additional layer which can have security issues, is not designed and
maintained by you, and has a performance impact (a lot more
instructions have to be executed to deliver the same service).

> Data can be
> encrypted and stored and you can always have a workflow where the data never
> sits unencrypted within the cloud machine. In fact the cloud machine [at
> least on Amazon] can connect to an external machine on a VPN, so the policy
> data itself can be stored outside with only the operations running on the
> cloud.

Yes, but where do you store your private encryption keys? Who has
physical access to them? :-)
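
A minimal sketch of the workflow you describe (encrypt locally, ship
only ciphertext to the cloud), using the Python cryptography library;
it also shows exactly where my question bites -- the security of the
whole scheme collapses to wherever the key lives and who can reach it:

from cryptography.fernet import Fernet

key = Fernet.generate_key()        # must be stored OUTSIDE the cloud machine
cipher = Fernet(key)

plaintext = b"confidential policy data"
ciphertext = cipher.encrypt(plaintext)   # this is all the cloud ever sees

# ... upload ciphertext to the cloud store ...

recovered = cipher.decrypt(ciphertext)   # only possible where the key is held
assert recovered == plaintext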

> And lastly I do not agree with Luchesar about physical security. Although a
> nice big computer room with security guards and canines outside conjure up a
> rosy picture of a secure environment, the biggest problems and leaks don't
> happen due to people getting physical access to the machine.

This is simply not true; it is even naive. Again, it is a matter of
trust rather than anything else.

> Rather
> carelessly designed code will be the bigger culprit in the long run.

I agree here. But what ensures that the virtual infrastructure you're
using doesn't have a single security issue? I'm convinced that it has
a lot of them, and they just add up to the pile of such issues, part of
which might be due to your own code. However, you have access to your
own code and can test and fix it, which is not true for the code of
the additional layers beneath yours, which you nevertheless trust 100%...

> In fact
> having it on the cloud ensures that no one gets physical access.

Well, a lot of people have physical access -- all the maintenance
personnel for instance. You're right though that YOU don't have
access, which doesn't help ensure physical security in any way :-)

> Amazon also operates something called CloudFront, which is essentially
> virtual machines with low latency to a particular geographical area, which
> is ideal for our situation. Also Amazon's load balancing is automated and at
> the least supports a private network within the cloud which would support
> perfectly our load balancing strategies.

This sounds nice indeed.

> The biggest advantage of the cloud system of course is instant scale-up. You
> just have to shut down an instance, increase its memory and CPU and bring it
> back up again. I think that in itself will solve half of our problems in the
> most cost-effective manner. You can see our code running on the cloud at:
> http://50.19.222.138:8080/MaxtoxMCSS
> The static IP is also being provided free by Amazon.

Well, you're encouraged to spend some time and try installing and
testing OpenTox's AA infrastructure on Amazon, just as I did in the
last few days on our own server. It would be interesting to see
whether it will run, how well it will run, to what number of policies
it would scale up, etc. (currently we already have 40000 policies in
our setup and the service exhibits a considerably lower latency than
OT's production instance, which has only ~13000 policies defined and
... happens to run on a virtual machine "in the cloud"). The results
you would obtain could serve as input to the framework evaluation task
of WP1 (deliverable due by the end of August).
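
For completeness, a rough sketch of the kind of latency measurement
behind the numbers above: time repeated authorization checks against an
OpenSSO/OpenAM-style REST interface. The server URL, the protected
resource and the token are placeholders:

import time
import requests

AA_SERVER = "https://aa.example.org/opensso"   # hypothetical
TOKEN = "AQIC5w..."                            # a valid session token
RESOURCE = "https://services.example.org/dataset/1"

samples = []
for _ in range(100):
    t0 = time.time()
    requests.post(AA_SERVER + "/identity/authorize",
                  data={"uri": RESOURCE, "action": "GET", "subjectid": TOKEN})
    samples.append(time.time() - t0)

samples.sort()
print("median %.3f s, 95th percentile %.3f s"
      % (samples[len(samples) // 2], samples[int(len(samples) * 0.95)]))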

I think that there might be some merit in the approaches you're
advocating -- you just have to prove this by conducting an experiment,
as I did in fact :-) Otherwise we could end up just theorising.

> On the whole I think having a central policy server is a serious security
> hassle. We don't want all the client data compromised in the case of a
> failure of the central policy server.

In fact it is quite the opposite -- security can be looked after and
enforced better when you have a smaller number of things to check and
control, all other things being equal.

> A distributed system as I had
> suggested earlier is more desirable and easily designed.

I like the idea of adopting a completely distributed AAA system and
have been advocating this approach since the very beginning of OT (not
necessarily for security reasons, but rather for scalability and
resilience). However, the review of existing technologies that we
performed two years ago revealed that there was no suitable
complete solution for distributed RESTful AAA at that time. We could
have opted for designing and implementing a new one from scratch, but
decided to avoid this, since it would have required a lot more
resources than we had allocated in the OT description of work and also
a different focus of partner competences.

> Such a system would
> have specific policy servers for certain geographical or logical groups. If
> one is hacked the other groups are still secure.

As always -- the devil is in the details. If you start thinking of
them, you might end up with a different opinion. However I do agree
that having such a system would be nice.

> Also OpenAM has serious inconsistencies within it like numerous names for
> the same variable etc., which as Andreas has said in the past is not
> possible to solve without hacking the code.

This is just a simple cosmetic issue, compared to what you would be
confronted with when designing an AAA system from scratch and,
moreover, proving its soundness to the community.

> A&A should also play a role in
> collecting and disposing of unused resources, which right now is not a
> possibility within OpenAM.

Well, I agree that such functionality would be desirable. I don't know
whether it can be done with OpenAM -- how do you know it cannot?
Have you checked?

I also think that an important aspect of AAA is the third A -- that is
to say, Accounting. This is definitely something nice to have (think of
quotas, for instance). Again -- we have to check what our options are
for supporting this.
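
Purely illustrative -- what a minimal accounting hook (the third A)
tied to the existing tokens could look like; nothing like this exists
in our setup today, and the storage and quota numbers are invented for
the sketch:

from collections import defaultdict

QUOTA = 1000              # e.g. allowed requests per billing period
usage = defaultdict(int)  # token -> requests consumed (would be persistent in practice)

def account_and_check(token):
    """Record one request for this token and report whether it may proceed."""
    usage[token] += 1
    return usage[token] <= QUOTA

if not account_and_check("AQIC5w..."):
    raise RuntimeError("quota exceeded for this account")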

Kind regards,
Vedrin


