GreasedGradient

Some tidbits while tinkering.

New iOS 7.1 location changes

One of the best things about iOS 7.1 is the improved handling of background location updates, even after an app has been closed. I have an app idea that needs this running with iBeacon, but previous versions were too slow and unreliable.
This also works with geofencing, although I can’t see any effect with IFTTT at the moment.
Apple really should communicate these sorts of changes, though, as the privacy impact on geofencing apps is probably something that should be considered.
I’ll wait and see what the effect on battery life is, too.
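For reference, this is roughly the pattern I have in mind – a minimal sketch of background beacon-region monitoring using Core Location, written in modern Swift. The UUID and region identifier are placeholders, and you would still need the usual location usage descriptions in Info.plist; treat it as an illustration rather than a drop-in implementation.

    import CoreLocation

    // Minimal sketch of background beacon-region monitoring.
    // The UUID and identifier below are hypothetical placeholders.
    final class BeaconMonitor: NSObject, CLLocationManagerDelegate {
        private let manager = CLLocationManager()

        func start() {
            manager.delegate = self
            manager.requestAlwaysAuthorization() // background delivery needs "Always"

            let uuid = UUID(uuidString: "E2C56DB5-DFFB-48D2-B060-D0F5A71096E0")!
            let region = CLBeaconRegion(uuid: uuid, identifier: "example.region")
            region.notifyOnEntry = true
            region.notifyOnExit = true

            // Region monitoring survives app termination; iOS relaunches the app
            // in the background when the region boundary is crossed.
            manager.startMonitoring(for: region)
        }

        func locationManager(_ manager: CLLocationManager, didEnterRegion region: CLRegion) {
            print("Entered region: \(region.identifier)")
        }

        func locationManager(_ manager: CLLocationManager, didExitRegion region: CLRegion) {
            print("Exited region: \(region.identifier)")
        }
    }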

LAUSD and Cheapo MDM

I’m actually a big fan of using Apple software like Configurator and Profile Manager to do MDM on the cheap – and using ActiveSync profiles means that most people will be able to secure the information on the device, which is what most companies need.

But the particular needs of the LA Unified School District don’t really fit neatly here.

Thanks to a whole range of news articles (the best one, I think, is from Ars Technica), the LAUSD has now found out that users can easily delete their ActiveSync profile and return to an ‘unmanaged’ state.  I’m seriously amazed they got this far down the path – a $1B project, presumably with plenty of direct support from Apple – without realising that profiles can be deleted.

Apple has, in effect, always gone for a model where a device can, one way or another, be put back to a ‘virgin’ state. Even if this particular method weren’t available, there are plenty of other ways to restore an iPad, with the only minor issue being regaining access to specific enterprise apps.  It’s not rocket science.

The key is whether you secure your data at the app level, and whether you provide enough benefit that your users continue to elect to be under management.  In business this is easy (lost devices can be claimed on insurance; lost data is harder), and for home users, parents can just take away the device.  But education is in a unique position: programs will rely on access to these iPads, so the district needs to lock the devices down entirely, yet, being a school system, it doesn’t have the budget for good MDM to help.

One thing I’m not yet aware of is whether any device management features – particularly ‘supervised’ mode in the new versions of Apple’s server management tools and iOS – will prevent this.  Activation Lock and supervised mode give us some of the pieces, but I haven’t seen or tested the ability to lock down profile removal in iOS 7.  Still, with this sort of news hitting Apple in education (though not really their fault), one could imagine it being a feature introduced in iOS 7.1, if it’s not in there already…

iPhone 5s fingerprint reader

While I was initially very excited about the potential of the iPhone 5s fingerprint reader (I’m particularly hopeful it will lead to more people locking their handsets), I’ve grown sour on expanding fingerprint use beyond the initial lock screen, and perhaps some specific Apple usage.

Why?

Because while Apple is incredibly careful to say that no one can access your fingerprint, articles such as this one by Wired point out that if the API is opened up, you lose the ability to maintain that certain actions taken on the phone were not taken by you. While this doesn’t worry me too much personally, the real issue for me is a little deeper.

If the API is open, what stops an application – without ever getting access to the fingerprint itself – from gaining knowledge of the fingerprint ID, collecting additional information, and matching that ID to activity on the phone?  In fact, this is natural, and would be absolutely required for apps to do personalisation.  However, if this ID is common between apps and the OS itself, using the fingerprint for SSO would allow people to profile an individual user against a fingerprint.

What’s the difference from SSO, you ask?  Rejection – or the ability to physically map these credentials to someone else under duress.  Later, when people map their fingerprints to passwords, their internet banking and so on, the incidence of a robbery resulting in the attacker taking your entire fortune will increase significantly.

So two main issues worry me if they open the API: profiling with a very strong physical mapping to the user, and users over-relying on the convenience factor of SSO.

Three main solutions:

  1. If the API is opened, apps should never be able to map an ID to a user.  This means the fingerprint sensor (and SSO API) should provide per-app IDs that are completely random (see the sketch after this list).  The side-channel aspects of this will be hard to contain, such as apps performing per-ID mapping based on real-time usage reported to a server.  But completely random IDs should make this essentially the same issue as we have now.
  2. Revoking an app’s use of the fingerprint should be an OS-level function, which should also reset the ID issued for that fingerprint.
  3. Requiring apps to use fingerprint plus password, for things like banking apps, should be an API-supported function.  As should things like a duress password.
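To make the per-app ID idea in point 1 concrete, here’s a minimal, entirely hypothetical sketch – none of this is a real Apple API. The idea is simply that the OS holds a secret, hands each app an opaque token derived from a per-app salt, and can rotate that salt to revoke an app without touching the fingerprint itself.

    import Foundation
    import CryptoKit

    // Hypothetical sketch only – not a real Apple API. It illustrates the idea
    // that the OS could hand every app an opaque, random, per-app ID for a finger,
    // so two apps can never correlate the same user by fingerprint.
    final class FingerprintIDBroker {
        // Secret known only to the OS/secure element in this thought experiment.
        private let deviceSecret = SymmetricKey(size: .bits256)
        // One random salt per (app, finger) pair; rotating it revokes the old ID.
        private var appSalts: [String: Data] = [:]

        func token(forApp bundleID: String, fingerSlot: Int) -> String {
            let saltKey = "\(bundleID)#\(fingerSlot)"
            let salt: Data
            if let existing = appSalts[saltKey] {
                salt = existing
            } else {
                salt = Data((0..<16).map { _ in UInt8.random(in: .min ... .max) })
                appSalts[saltKey] = salt
            }
            var message = Data(saltKey.utf8)
            message.append(salt)
            let mac = HMAC<SHA256>.authenticationCode(for: message, using: deviceSecret)
            return Data(mac).base64EncodedString()
        }

        // OS-level revocation (solution 2 above): the app's previous ID becomes useless.
        func revoke(app bundleID: String, fingerSlot: Int) {
            appSalts.removeValue(forKey: "\(bundleID)#\(fingerSlot)")
        }
    }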

Managing all of this in a typically Apple “simple, magic” way – with what is suddenly the world’s most popular fingerprint reader – is going to be a monumental task, and one that will take a great deal of care.

Kudos to Apple for not rushing this out as an early launch feature and harming people’s security.  One of the only issues I have is that some in the Android camp probably won’t take such a measured view, and might mess up the party for everyone…

iOS 7 is out

Really, really excited about the possibilities in iOS 7, with its increased focus on security and elements like trust prompts for the computer you connect to, and notifications of additional users on iMessage.  It is fantastic that Apple is becoming more mature in its approach to security, but a pity that issues like the lock screen bypass are back…

If you’re keen on security, never jump to a new iOS version until a few of the bugs have been ironed out…

MDM vs API and Application Level controls

This article at Ars Technica had a phrase that really gets to one of my primary issues with MDM (Mobile Device Management): “Most people believe that mobile device management—the idea that I can ask an employee to hand over his new personal phone, root it, hand it back to them—is not viable,” from Michael Mullany, CEO of Sencha.

For some reason, the security field still believes it is appropriate! Even though I know the concept, and I am in security, the idea of trusting my enterprise with root control of my device is still alien, and a massive turn-off for BYOD.

With de-perimeterised networks, federated identities, APIs causing unprecedented levels of interaction, and typically poor security models applied at the application layer, direct control of the mobile device is archaic, still rooted in ‘if the CEO loses his iPhone, someone can access all his email’ thinking.

One of the great aspects of the article is the brief discussion of API-level governance.

The reason we (as an industry of security professionals) look to MDM is that dealing with developers and building an application security model is simply too hard for many of us – precious few of whom program, and fewer still know the benefits of things like Agile and new API interaction models. But much as many security professionals (including me) missed the boat on iPhone adoption, this model is already in.  It’s not the future, it’s the now, and we absolutely need to embrace it.

The focus on BYOD and device-level sandboxing is useful, but ultimately only a tool to accentuate the new model of security: defining and implementing security on a per-app basis, while sharing authentication and authorisation duties around.  This will require a massive change in thought processes, and a modelling of user/role and data access that, in my experience, hasn’t been done outside academia.  It’s made all the more complex by the unknown sets of interaction possible, and may well require the data itself to carry tokens describing how it can be used – so that, for example, the photo you share with your friends can’t be shared publicly by them – which will require a new set of common, standard protocols to be developed. Hopefully these can be as useful as the original OAuth standard, rather than the less successful ones, like OAuth 2.0 :)
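To give a feel for what ‘data carrying its own usage tokens’ might look like, here’s a small hypothetical sketch – the types, the HMAC-based signing and the policy fields are all my own invention for illustration, and a real protocol would use asymmetric signatures and standardised formats.

    import Foundation
    import CryptoKit

    // Hypothetical sketch of data that carries its own usage policy, signed by
    // the originating service so downstream apps can verify (but not alter) it.
    struct UsagePolicy: Codable {
        let audience: [String]    // who may view it, e.g. ["friend:alice"]
        let allowReshare: Bool    // may the recipient share it further?
        let expires: Date
    }

    struct SharedPhoto {
        let payload: Data         // the photo bytes
        let policy: UsagePolicy
        let signature: Data       // MAC over payload + encoded policy
    }

    enum Sharing {
        private static func encode(_ policy: UsagePolicy) throws -> Data {
            let encoder = JSONEncoder()
            encoder.outputFormatting = .sortedKeys   // deterministic bytes for signing
            return try encoder.encode(policy)
        }

        static func seal(_ photo: Data, policy: UsagePolicy, key: SymmetricKey) throws -> SharedPhoto {
            let mac = try HMAC<SHA256>.authenticationCode(for: photo + encode(policy), using: key)
            return SharedPhoto(payload: photo, policy: policy, signature: Data(mac))
        }

        static func mayReshare(_ item: SharedPhoto, key: SymmetricKey) throws -> Bool {
            let valid = try HMAC<SHA256>.isValidAuthenticationCode(
                item.signature,
                authenticating: item.payload + encode(item.policy),
                using: key)
            return valid && item.policy.allowReshare && item.policy.expires > Date()
        }
    }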


DR Planning vs DR Testing

Recently I’ve been involved in a lot of DR work, including reviewing best practice and seeing how the business goes about implementing DR plans.

I’ve come to the realisation – much like the movement towards Agile in development – that DR, at least in the IT realm, is out of date, and that the emphasis required by standards such as ISO 27001 et al. is misplaced.

In practice, many IT systems are built with a range of different fallback mechanisms, and in a great many cases IT has documented, or generally knows, how to fail over to a separate system. So Disaster Recovery Planning or Business Continuity Planning will pretty much always look good, at least to a general muster – unless your IT area is just too busy to do it, or very immature.

But things like flooding taking out most of the CBD can make DR plans in an environment like Manila pretty useless. How do you do DR if staff can’t make it to the DR site, or it’s flooded too, or the flooding is so bad you couldn’t run things from home because their homes are flooded as well? One of the truisms of DR planning is that it’s the unplanned things that get you – and yet how many people were asked to write Avian Flu DR plans for a scenario that was very, very unlikely?

In the end, it’s your ‘black swans’, or cascading failures, that will have the biggest impact – and most organisations simply don’t have the resources to do any sort of actual prevention for these, because they are too rare, and generally too costly, to remediate, if you can at all!

In addition, DR planning is done, a lot of the time, by external consultants or centralised personnel without much deep knowledge of the systems involved. As such, DR planning, like most planning, should be treated as a ‘best guess’ of what people think is most important at the time of planning, along with the remediation steps. In many cases it’s not even that – very few plans I’ve seen cover the money required, whether the business will have appropriate cash flow, the resources available, or the backup facilities/space to enact a proper DRP. In particular, issues with SANs, reconstructing from backups, and data centre space are big problems that are not normally considered.

So what are the things that work to help you in DR?

1. Full backup environments, in a fully separate location – such as development environments that can be repurposed.
2. Doing testing. A lot of plans are developed with very little real rigour, and only by going through scenarios can you legitimately tease out how and why things will work.
3. Awareness amongst employees that DR needs to be considered, so new systems have it baked in, and manual workarounds are available.
4. Consideration of any unique hardware or software that may be difficult to obtain quickly – can you run without it?
5. How long will it take before enacting a full-scale DR solution is financially viable? This is a number you should have in your head, in conjunction with customer impacts and reputation damage, to enable decision makers to act quickly (a rough worked example follows this list).
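As a purely hypothetical illustration of that break-even number – every figure below is invented, so substitute your own estimates:

    // Rough, hypothetical break-even calculation: after how many hours of outage
    // does activating full-scale DR become cheaper than continuing to wait?
    let revenueLossPerHour = 20_000.0      // direct revenue impact (invented figure)
    let reputationCostPerHour = 5_000.0    // softer costs: churn, SLA penalties, goodwill
    let drActivationCost = 150_000.0       // standing up DR: facilities, vendors, overtime

    let hourlyOutageCost = revenueLossPerHour + reputationCostPerHour
    let breakEvenHours = drActivationCost / hourlyOutageCost

    print("Break-even point: \(breakEvenHours) hours of outage")  // 6 hours with these numbers

With numbers like these worked out ahead of time, the ‘do we pull the trigger?’ conversation becomes much shorter.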

What things break in practice?
1. Access to locations isn’t available out of hours.
2. Key systems or physical items (laptops, keys, keycards etc.) are lost, their whereabouts are unknown, or they are made unavailable as part of the incident.
3. Failover systems don’t. Or, even harder to detect, can’t fail back.
4. Firewalls and access control aren’t configured for the backup site/environment.
5. Agreements for space in DCs (or space that is ‘reserved’) and backup hardware from vendors turn out to be much more expensive than planned when needed in a hurry.  In some cases, things like connecting fibre runs, additional power etc. can be extremely difficult to get done quickly if you don’t have vendors willing to work with you.

So why test?

IT infrastructure has incredibly short lifecycles in general terms. But things like evacuation plans and access to external resources (office space, data centre space) also change quite often, and in my experience people are very rarely notified.

You may be lucky enough to have a full-time DR support person who can keep track of everything, but very few organisations do. Typically this role is split amongst incident managers (if you have them) and line management, or perhaps delegated to an area like security. In any case, it isn’t that person’s primary role, which increases the chance that the reality of the environment differs significantly from the ideal of the plan.

But testing finds the flaws. In Agile terms, it provides the concrete benefits of DRP sooner, whereas planning is analogous to the ‘requirements’ stage in traditional development: it takes a long time, you don’t have certainty about the result, and it doesn’t provide timely feedback (most of the time). Testing allows you to identify and immediately verify the solution, as well as point-in-time costs.  It also helps to weed out what the business says it does versus what it actually does, and where configuration changes have been implemented but not documented.  In some cases, it might provide feedback significant enough that you change the core system to be better prepared for a disaster, such as a new environment or additional manual workaround processes.

So start with a basic plan, but make it better with each test iteration. You’ll find you save time in planning and gain benefit in results. Much like a real incident, it also gives concrete evidence of why you’re doing DRP in the first place!

SAN – I’m not such a fan

Reading this story at AnandTech reminded me about why I’m not such a fan of SAN.

First off, I think it has uses. There are definite use cases for highly available, network-attached, high-performance SAN installations; I just think there aren’t as many as SAN vendors would like you to believe.
It’s an enticing option: the nirvana of expandable storage available across the network at lightning speed, with high reliability and a range of enterprise-level features – de-duplication, virtualised storage, highly dynamic provisioning, simplified backup and so on.

But in general, it’s just:

  1. Far too expensive to be worth it.
  2. Underperforming in real workloads.
  3. Expensive/hard to put in, expensive/hard to maintain, and it forces you onto an upgrade cycle that seems unfair to the consumer.

My general rule, from time in the industry, is that it’s expensive to scale out in hardware, and cheap in software. SAN doesn’t obey this rule, and for an industry obsessed with horizontal scalability, SAN is a distinctly hierarchical model – it just lets you believe you’re scaling horizontally when you’re not.

As the article explains, far better than I can, SAN has underperformed for a very long time, and it requires huge expense – largely glossed over by the vendors – for maintenance, specialised skills to manage, specialised networks, and specialised, custom (normally hard-to-get) hardware, while not necessarily providing significantly improved provisioning times over traditional storage.  It’s also, from experience, a massive single point of failure.  We had firmware issues with our SAN environment at one stage that took down whole sections of the data centre and wiped out our backup arrangements, because they were local and on the same misbehaving SAN.  Admittedly, a range of other issues contributed to that scenario – but we were sold equipment expecting it to do, and behave, much better than it actually did.

Now, virtualisation is a special case in the SAN conundrum, and it’s no accident that the biggest SAN vendor had the cash, and the reason, to snatch up the biggest virtualisation vendor. SAN provides a range of benefits for virtualisation, allowing increased flexibility, but how many people really use them? vMotion is probably one of the biggest benefits, but it is underused in practice.

The real problem, as I see it, is the lack of a cheap way to abstract storage in software while taking advantage of the new range of SSDs available – to get the advantages that companies like Facebook have in abstracting their storage, without paying the SAN price.

We need something like the Hadoop/CouchDB revolution in Big Data to happen in storage virtualisation.

The good news is that companies like Fusion-io are working to bridge this gap.  I’m hoping that after they forge the way, enterprising minds will work out how to do a passable attempt within open source, and we’ll finally have a storage layer that works. Until then we need to keep waiting, but hopefully, if we have enough control of the application itself, individual companies can build their apps to use cheap IOPS on local PCIe SSDs, while implementing caching to external storage pools.
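As a rough sketch of that last application-level pattern – the names and stores are illustrative, not any particular product – the shape is just a write-through layer over two tiers of storage:

    import Foundation

    // Hypothetical sketch: serve reads from cheap, fast local (PCIe SSD) storage,
    // and write through to a slower external pool for durability.
    protocol BlobStore {
        func read(_ key: String) -> Data?
        func write(_ key: String, data: Data)
    }

    final class WriteThroughStore: BlobStore {
        private let local: BlobStore    // e.g. files on a local PCIe SSD
        private let remote: BlobStore   // e.g. an external/object storage pool

        init(local: BlobStore, remote: BlobStore) {
            self.local = local
            self.remote = remote
        }

        func read(_ key: String) -> Data? {
            // Fast path: local SSD. Fall back to the remote pool and repopulate.
            if let cached = local.read(key) { return cached }
            guard let data = remote.read(key) else { return nil }
            local.write(key, data: data)
            return data
        }

        func write(_ key: String, data: Data) {
            // Write-through: durability comes from the remote pool,
            // low latency for subsequent reads comes from the local copy.
            remote.write(key, data: data)
            local.write(key, data: data)
        }
    }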

BYOD and MDM – why bother?

Vito Forte of Fortescue Metals has recently commented on not overcomplicating mobility.

For me, the idea of BYOD and the normally draconian rules applied once you purchase an MDM solution seem diametrically opposed.  The process seems to be:

  1. Allow BYOD because someone told you it would save money.
  2. Somebody freaks out about security, and lack of control.
  3. The solution is MDM!
  4. MDM is expensive.  And nobody wants to install it.
  5. And thus, some companies even go to the point of purchasing the devices that they were meant to get employees to buy!

MDM is, honestly, a fantastic tool for enterprise fleets – in particular those that need to roll out and manage enterprise software, or that have very strict requirements on what their mobile user base can do.

But it doesn’t meld well with BYOD.

Typically, most of what people want MDM to do (passcodes on devices and remote wipe – really, that’s the typical useful limit!) can be done via ActiveSync policies in Exchange, which almost everyone runs, or via something like Google Apps, which supports the same functionality. You don’t get encryption, but that should soon be easily accessible on Android and iOS devices, even on a per-app basis, which is the essence of BYOD.

Then all you need to do is make sure you encrypt data in transit – perhaps using the new iOS per-app VPN feature – and, even better, use the new iOS SSO features (not yet publicly released) to make sign-in across multiple apps relatively painless.

You lose the ability to roll out enterprise software automatically, but both Android and Apple seem to be recognising the difficulty of enterprise apps under BYOD, and are working hard on solutions. Expect iOS 7 to have a number of improvements here, and there are ways around it on Android already.

My point is, don’t treat mobile as something to over-regulate because you don’t understand it. The new security models and improved overall device constraints make the average mobile significantly more secure than the average desktop – even most policy-secure desktops. Embracing the new systems with a little thought will allow more flexible deployments, and keep the bean counters happy.

Auditing Your Service Provider

(adapted from a presentation I did at the 2011 Oceania CACS Conference)

As a customer, it’s sometimes useful to know what you should generally be looking at when auditing a service provider – the normal areas of concern, what should trigger red flags, and so on – because the reality of auditing a service provider is that they typically know you are coming, and you don’t have the ability to review material in the same way a formal audit or internal audit would.

1. SLAs: Review these carefully.  In particular, pay attention to the exclusions, the areas of applicability, how they will be measured, and the penalties assigned. As mentioned in other posts, one of the key things about SLAs is that a good many service providers don’t see them as terribly binding, and are very optimistic.

2. How exposed are you to failures within the service provider’s infrastructure? Can your services still run if their AD server fails? Are you depending on single points of failure in the network infrastructure? Are you reliant on their DNS systems, load balancers, mail routers etc., even if you haven’t specifically decided to use these services?  Are these services covered under your SLA?

3. Where and how is data hosted in the service? Are systems backed up, and do they have off-site secure backup?

4. What is the stability of the company? Are they acquisitive, are they ripe for takeover, are the services you are using core to the company? How is the share price, and recent advice to the market?

5. Do they have security policy, and is it compliant with ISO 27001? Do they have any evidence of operation of the policies you can vet? (note, many service providers may legitimately not provide you with logs or details for confidentiality reasons, but it helps to ask)

6. Data centre – do they run their own, and if they do, what controls are in place around it, and can you inspect the site?  If they use a third-party provider, it’s still worth seeing whether you can visit the site; otherwise, ask for the SLA with the DC provider and make sure it’s reputable via point 4.

7. What are the notification arrangements in the event of a security, or other important incident?  Is this 24×7 (if you need it) and if so, are your staff ready to take that call?

8. Is the service provider dependent on any third parties (typically at least network providers)? If so, how does the flow-on SLA from that provider look, and are arrangements in place to meet your SLA? Is the service provider liable if a third party doesn’t meet their SLA?  Do they have backups and alternate arrangements to cover the failure of third parties?

9. Make sure to review the proposed service in the vendor selection process, and ensure the requirements are relevant to the service being provided. Some companies have no choice due to regulation, but uninterested or overly protective internal security or internal audit teams can come in with stiff requirements at times, well beyond what is justified for the service.

10. Make sure that the delivered service is still up to scratch after the contracts have been signed and the service has been deployed! In a lot of cases, I’ve seen quite tough initial vendor selection criteria applied to a service that ended up dropping most of the important things (like IDS, firewalls, backups, DR environments) because they were too costly in the end.

DR and External Hosting

A neglected area of the hosting equation for quite a number of customers I have seen is the need for a Disaster Recovery (DR) or Business Continuity Plan (BCP) in the event the service at the provider goes down. This typically seems to be because, under normal hosting arrangements, the cost of a ‘warm’ service (a recent version of the software, able to be ‘turned on’ within a short timeframe – normally needing a full hardware duplicate) is prohibitive, often equalling the cost of the original service.  ‘Hot’ services (a fully redundant, real-time replicated service in another location) are extremely expensive, even when used to house development resources – a common scenario to help the ROI.

This leaves the customer quite vulnerable to outages of any sort to the service.  And in the event of a true disaster, relying on the service provider can leave you at the end of what might be a very long queue for restoration.  Let’s go through a few examples of outages, and what many service providers will do:

  1. Network Outage: Service provider reaction?  Nothing, really.  If they can’t keep the network going they have bigger problems than DR.  This happens a lot with single-DC providers, but as a lot of services naturally sit in a single DC, this can easily affect a customer.  Anything from a single dodgy router (affecting others down the line, or causing ‘flapping’ – continual switching between two routers), to denial of service, to DNS attacks (more common nowadays), to a full ‘backhoe incident’ could be at play here. In most cases you can only sit and wait, as without a DR environment, bringing the network back up is invariably the quickest way to get things online.
  2. SAN Outage: These are pretty dire.  Most service providers now rely heavily on SAN, which opens them up to wide-scale outages if the SAN has big problems.  And they do – it’s frightening how often they have to patch due to software bugs.  SANs are expensive, typically not over-provisioned (as that is even more expensive), and impossible to replace quickly.  This means most service providers, faced with a large outage, will just have to wait it out.  Unless you’re lucky enough to have DR, or a server with locally attached storage, so will you.
  3. Internal Network Outage: Normally not a problem.  Network equipment is simpler, easier to troubleshoot and fix, and cheaper to replace, so most environments have backups.  That said, things like Spanning Tree issues can really mess up your day, leaving you with multi-hour outages.
  4. Server Outage: Once a fairly big issue, but with the prevalence of virtualisation and good backups this is now normally a multi-minute outage.  The biggest issue is if the service provider doesn’t have adequate monitoring, and can’t see there is a problem. Even with servers using locally attached storage, most service providers have spare equipment to bring a server up again within a few hours from a backup, unless you use custom hardware.
  5. Software/Service Issue: This happens quite a bit, with problems in software upgrades, incompatible versions, inadequate testing, or platforms that are just a bit ‘flaky’. In this case troubleshooting is required, and a ‘warm’ or ‘hot’ backup you can revert to a previous version on is a really useful tool, and something that in most cases can at least get you working again.

Thankfully, there are a few things you can do to make things easier.

  1. Check whether you have easy access to backups if you need them.  If you don’t, seriously consider instituting your own system-image-based and filesystem-based backups (they might be the same, but a system image may be useless on incompatible hardware) so that, if you need to, you can walk to another provider with the backup in hand.
  2. If you don’t have application-level monitoring in place, invest in one of the many web-based providers who can monitor your site/service 24×7 (a minimal example of such a check follows this list).  Many service providers are incapable of performing application-level monitoring, and it’s not a good idea to have them do it anyway, as you want an external view of the service to make sure things like DNS are working appropriately.
  3. Most customers cannot afford a ‘warm’ or ‘hot’ backup solution, and access to backups can be problematic in a real outage scenario.  But moving development environments to a cloud provider can bring benefits such as continuous integration, as well as providing a ready-made environment for DR. Typically, customers will have a fully replicated test environment in the cloud. While this may not allow the same level of performance in the event of a true disaster, using these environments provides great test data and a true DR solution, independent of your service provider.  If your provider is a cloud provider, consider a separate provider offering similar services, as we have seen cloud providers like AWS and Azure suffer multi-region outages lasting many hours in the past.
  4. Monitor your SLAs.  Most make no provision for ‘force majeure’, which means that in a true DR scenario for the service provider, you are out of luck.  Look at real backup options to help in this situation.
  5. Investigate insurance arrangements.  Whilst not typically useful for handling losses due to outages (normally, that cover is pretty expensive), insurance can help with the cost of equipment and migration to another service provider if needed.  This can make the agonising decision about when to trigger DR a lot easier to handle, and even if cash flow is a problem, short-term loans could be obtained, or money fast-tracked, if you’ve made prior arrangements.
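On point 2, this is all an external check really needs to do – a minimal sketch, with a placeholder URL, of the sort of application-level probe a monitoring provider runs against your service every minute or so:

    import Foundation

    // Minimal sketch of an application-level health check run from outside the
    // provider's network. The URL is a hypothetical placeholder.
    let url = URL(string: "https://example.com/health")!
    var request = URLRequest(url: url)
    request.timeoutInterval = 10

    let semaphore = DispatchSemaphore(value: 0)
    var healthy = false

    URLSession.shared.dataTask(with: request) { _, response, error in
        defer { semaphore.signal() }
        if let error = error {
            print("ALERT: request failed: \(error.localizedDescription)")
            return
        }
        if let http = response as? HTTPURLResponse, http.statusCode == 200 {
            healthy = true
        } else {
            print("ALERT: unexpected HTTP status")
        }
    }.resume()

    semaphore.wait()
    print(healthy ? "Service OK" : "Service DOWN - raise an incident")

Checking DNS resolution, certificates and response codes from outside is exactly the view your own provider-hosted monitoring can’t give you.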

I’ll go into more detail in later posts on specific techniques to aid in DR, on planning for DR and business continuity events, and on how to calculate the elements of DR from the SLAs and availability metrics of your components – a taste of that last calculation is below.
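As a preview, the back-of-the-envelope version is simple: components in series multiply their availabilities, while independent redundant paths combine as one minus the product of their failure probabilities. The SLA figures here are invented, and real components are rarely fully independent:

    // Back-of-the-envelope availability maths with made-up SLA figures.
    let network = 0.999       // provider network SLA
    let san = 0.995           // shared SAN
    let server = 0.999        // virtualised server

    // Single path: everything in series.
    let singlePath = network * san * server             // ≈ 0.993, roughly 61 hours down per year

    // Add an independent warm DR environment with the same profile.
    // (Assumes full independence, which real environments rarely achieve.)
    let drPath = network * san * server
    let withDR = 1 - (1 - singlePath) * (1 - drPath)    // ≈ 0.99995, under half an hour per year

    let hoursPerYear = 24.0 * 365.0
    print("Single path downtime: \((1 - singlePath) * hoursPerYear) hours/year")
    print("With DR downtime:     \((1 - withDR) * hoursPerYear) hours/year")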

