Technology Archive

How Machine Learning Can Transform Content Management - Part II

In a previous post, I highlighted how machine learning can be applied to content management. In that article, I described how content analysis could identify the social, emotional and language tone of an article. This gives content authors further insight into their writing style, and helps them understand which style resonates most with their readers.

At the last AWS re:Invent conference, Amazon announced a new set of Artificial Intelligence Services, including natural language understanding (NLU), automatic speech recognition (ASR), visual search, image recognition and text-to-speech (TTS).

In previous posts, I discussed how the Amazon visual search and image recognition services could benefit digital asset management. This post will highlight the use of some of the other services in a content management scenario.

Text to Speech

From a content perspective, text to speech is a potentially interesting addition to a content management system. One reason could be to assist visitors with visual impairments. Screen readers can already perform many of these tasks, but leveraging the text to speech service gives publishers more control over the quality and tone of the speech.

Another use case is around storytelling. One of the current trends is to create rich, immersive stories with extensive photography and visuals. This article from the New York Times describing the climb of El Capitan in Yosemite is a good example. With the new text to speech functionality, the article can easily be transformed into a self-playing presentation with voice, without additional manual effort such as recruiting voice talent and recording sessions.

Amazon Polly

Amazon Polly (the Amazon AI text to speech service) turns text into lifelike speech. It uses advanced deep learning technologies to synthesize speech that sounds like a human voice, and includes 47 lifelike voices spread across 24 languages. Polly is a service that can be used in real-time scenarios, but it can also be used to retrieve a standard audio file (such as MP3) that can be stored and used at a later point. The lack of restrictions on storage and reuse of voice output makes it a great option for use in a content management system.
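
To give a sense of how simple the API is to use, here is a minimal Node.js sketch using the AWS SDK for JavaScript. This is not the proof-of-concept code itself; the region, voice and file names are assumptions for illustration.

```javascript
// Minimal sketch: synthesize a piece of content to an MP3 file with Amazon Polly.
const AWS = require('aws-sdk');
const fs = require('fs');

const polly = new AWS.Polly({ region: 'us-east-1' }); // region assumed for this example

async function synthesizeToFile(text, outputPath) {
  // Request a standard MP3 rendition of the text.
  const result = await polly.synthesizeSpeech({
    Text: text,
    OutputFormat: 'mp3',
    VoiceId: 'Joanna' // one of the built-in lifelike voices
  }).promise();

  // The audio can be stored and reused later, for example as associated content in the CMS.
  fs.writeFileSync(outputPath, result.AudioStream);
  return outputPath;
}

synthesizeToFile('Hello from the content management system.', 'main.mp3')
  .then((file) => console.log(`Wrote ${file}`))
  .catch(console.error);
```

Because the output is a plain MP3, it can be stored alongside the content and served from a CDN like any other asset.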

Polly and Adobe AEM Content Fragments

In our proof of concept, the Amazon AI text to speech service was applied to Content Fragments in Adobe AEM.

Content Fragment

As shown above, Content Fragments allow you to create channel-neutral content with (possibly channel-specific) variations. You can then use these fragments when creating pages. When creating a page, the content fragment can be broken up into separate paragraphs, and additional assets can be added at each paragraph break.

After creating a Content Fragment, the solution passes the content to Amazon Polly to retrieve audio renditions of the spoken content. It generates a few files: one for the complete content fragment (main.mp3), and a set of files broken up by paragraph (main_1.mp3, main_2.mp3, main_3.mp3). The audio files are stored as associated content for the master Content Fragment.

Results

When authoring a page that uses this Content Fragment, the audio fragments are visible in the sidebar, and can be added to the page if needed. With this capability in place, developing a custom storytelling AEM component to support scenarios like the New York Times article becomes relatively simple.

If you want to explore the code, it is posted on Github as part of the https://github.com/razorfish/contentintelligence repository.

Conclusion

This post highlighted how an old problem can now be addressed in a new way. The text to speech service will make creating more immersive experiences easier and more cost effective. Amazon Polly’s support for 24 languages opens up new possibilities as well. Besides the two examples mentioned in this post, it could also support scenarios such as interactive kiosks, museum tour guides and other IoT, multichannel or mobile experiences.

With these turnkey machine-learning services in place, creative and innovative thinking is critical to successfully solve challenges in new ways.

written by: Martin Jacobs (GVP, Technology)

How Machine Learning Can Transform Digital Asset Management - SmartCrop

In previous articles I discussed the opportunities for machine learning in digital asset management (DAM), and, as a proof of concept, integrated a DAM solution (Adobe AEM DAM) with various AI/ML solutions from Amazon, IBM, Google and Microsoft.

The primary use case for that proof of concept was around auto-tagging assets in a digital asset management solution. Better metadata makes it easier for authors, editors, and other users of the DAM to search for assets, and in some scenarios, the DAM can provide asset recommendations to content authors based on metadata. For example, it’s often important to have a diverse mix of people portrayed on your site. With gender, age, and other metadata attributes as part of the image, diversity can be enforced using asset recommendation or asset usage reports within a DAM or content management system.

Besides object recognition, the various vendors also provide APIs for facial analysis. Amazon AI, for example, provides a face analysis API, and this post will show how we can tackle a different use case with that service.

SmartCrop

One common use case is the need for one image to be reused at different sizes. A good example is the need for a small profile picture of the CEO for a company overview page, as well as a larger version of that picture on the detailed bio page.

Challenge screenshot

Cropping at the center of an image often works fine, but it can also result in the wrong area being cropped. Resizing often distorts a picture, and ends up incorporating many irrelevant areas. A number of solutions are out there to deal with this problem, ranging from open source tools to proprietary vendors. All of them leverage different detection algorithms to identify the area of interest in an image.

Results

Leveraging the Amazon Rekognition Face Analysis API, we can now solve this problem in a very simple way. Using the API, a bounding box for the face can be retrieved, indicating the boundaries of a face in that picture. With that bounding box, the right area for cropping can be identified. After cropping, any additional resizing can be done against the most relevant area of the image to ensure the image is at the requested size.
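
To make the flow concrete, here is a minimal Node.js sketch of the idea. It is not the proof-of-concept code from the repository; the choice of the sharp image library, the region, and the fallback behavior are assumptions for illustration.

```javascript
// Minimal sketch: use the Rekognition face bounding box to drive a crop.
const AWS = require('aws-sdk');
const sharp = require('sharp'); // assumed image library for cropping/resizing

const rekognition = new AWS.Rekognition({ region: 'us-east-1' });

async function smartCrop(imageBuffer, targetWidth, targetHeight) {
  const { FaceDetails } = await rekognition.detectFaces({
    Image: { Bytes: imageBuffer }
  }).promise();

  if (!FaceDetails.length) {
    throw new Error('No face detected; fall back to center cropping.');
  }

  // BoundingBox values are ratios of the overall image dimensions.
  const box = FaceDetails[0].BoundingBox;
  const { width, height } = await sharp(imageBuffer).metadata();

  return sharp(imageBuffer)
    .extract({
      left: Math.round(box.Left * width),
      top: Math.round(box.Top * height),
      width: Math.round(box.Width * width),
      height: Math.round(box.Height * height)
    })
    .resize(targetWidth, targetHeight) // resize the most relevant area to the requested size
    .toBuffer();
}
```

As noted further below, in production the bounding box would be retrieved once and stored with the asset rather than calling Rekognition for every rendition request.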

Solution screenshot

The result is shown in the image above. The image on the right is the result of leveraging the SmartCrop functionality based on the Face Analysis API. As you can see, it is a significant improvement over the other options. This SmartCrop could be improved further by adding additional margin around the face, or by incorporating some of the additional elements returned by the Face Analysis API.

The code for this proof of concept is posted on the Razorfish Github account, as part of the https://github.com/razorfish/contentintelligence repository. Obviously, in a real production scenario, additional optimizations should be applied for performance reasons. The Amazon Rekognition API call only needs to take place once per image, and can potentially be done as part of the same auto-tagging workflow highlighted in previous posts, with the bounding box stored as an attribute of the image for later retrieval by the SmartCrop functionality. In addition, the output from the cropping can be cached at a CDN or web server in front of Adobe AEM.

Conclusion

As this post highlights, old problems can now be addressed in new ways. In this case, it turns a task that was often performed manually into something that can be automated. The availability of many turnkey machine-learning services can provide a start for solving existing problems in a new and very simple manner. It will be interesting to see the developments on this front in the coming year.

written by: Martin Jacobs (GVP, Technology)

How Machine Learning Can Transform Digital Asset Management - Part III

In previous articles I discussed the opportunities for machine learning in digital asset management (DAM), and, as a proof of concept, integrated a DAM solution (Adobe AEM DAM) with Google Cloud Vision. I followed up with a post on potential alternatives to Google’s Cloud Vision, including IBM Watson and Microsoft Cognitive Intelligence, and integrated those with Adobe DAM as well.

Of course, Amazon AWS couldn’t stay behind, and at the last AWS re:Invent conference, Amazon announced their set of Artificial Intelligence Services, including natural language understanding (NLU), automatic speech recognition (ASR), visual search, image recognition and text-to-speech (TTS). Obviously, it was now time to integrate the AWS AI services into our proof of concept.

Amazon Rekognition

The first candidate for integration was Amazon Rekognition. Rekognition is a service that can detect objects, scenes, and faces in images, and makes it easy to add image analysis to your applications. At this point, it offers three core services:

  • Object and scene detection - automatically labels objects, concepts and scenes
  • Facial analysis - analysis of facial attributes (e.g. emotion, gender, glasses, face bounding box)
  • Face comparison - compare faces to see how closely they match

Integration Approach

Google’s API was integrated using an SDK, while the IBM and Microsoft APIs were integrated through their standard REST interfaces. For the Amazon Rekognition integration, the SDK route was taken again, leveraging the AWS Java SDK. Once the SDK has been added to the project, the actual implementation becomes fairly straightforward.

Functionality

From a digital asset management perspective, the previous posts focused on auto-tagging assets to support a content migration process or improve manual efforts performed by DAM users.

The object and scene detection for auto-tagging functioned well with Amazon Rekognition. However, the labels returned are generalized. For example, a picture of the Eiffel Tower will be labeled “Tower” instead of recognizing the specific object.

The facial analysis API returns a broad set of attributes, including the location of facial landmarks such as mouth and nose. But it also includes attributes such as emotions and gender, which can be used as tags. These can then be beneficial in digital asset management scenarios such as search and targeting.

Many of the attributes and labels returned by the Rekognition API include a confidence score, indicating how confident the service is in a particular detection.

Results screenshot

In the proof of concept, a 75% cutoff was used. From the example above, you can see that Female, Smile and Happy have been detected as facial attributes with higher than 75% confidence.
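
For reference, a minimal sketch of this filtering logic is shown below. The actual AEM integration uses the AWS Java SDK; this sketch uses the JavaScript SDK for brevity, and the bucket, key and tag names are illustrative.

```javascript
// Minimal sketch: collect labels and facial attributes with >= 75% confidence as tags.
const AWS = require('aws-sdk');

const rekognition = new AWS.Rekognition({ region: 'us-east-1' });
const CONFIDENCE_CUTOFF = 75;

async function tagsForImage(bucket, key) {
  const image = { S3Object: { Bucket: bucket, Name: key } };

  // Object and scene detection, already filtered by the service at the cutoff.
  const { Labels } = await rekognition.detectLabels({
    Image: image,
    MinConfidence: CONFIDENCE_CUTOFF
  }).promise();

  // Facial analysis; each attribute carries its own confidence score.
  const { FaceDetails } = await rekognition.detectFaces({
    Image: image,
    Attributes: ['ALL']
  }).promise();

  const tags = Labels.map((label) => label.Name);
  for (const face of FaceDetails) {
    if (face.Gender && face.Gender.Confidence >= CONFIDENCE_CUTOFF) {
      tags.push(face.Gender.Value); // e.g. "Female"
    }
    if (face.Smile && face.Smile.Value && face.Smile.Confidence >= CONFIDENCE_CUTOFF) {
      tags.push('Smile');
    }
    for (const emotion of face.Emotions || []) {
      if (emotion.Confidence >= CONFIDENCE_CUTOFF) {
        tags.push(emotion.Type); // e.g. "HAPPY"
      }
    }
  }
  return tags;
}
```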

Summary

The source code and setup instructions for the integration with AWS, as well as Google, Microsoft, and IBM’s solutions can be found on Github in the Razorfish repository.

One thing all the different vendors have in common is that their services are very developer-focused, and integrating these services with an application is very straightforward. This makes adoption easy. Hopefully, the objects being recognized will become more detailed and advanced over time, which will improve their applicability even more.

written by: Martin Jacobs (GVP, Technology)

Highlights from AWS re:Invent 2016

Another AWS re:Invent is behind us and it was packed with exciting announcements, including the launch of new products, extensions of existing services and much more. It was the biggest re:Invent ever, with a whopping 32,000 attendees and numerous exhibitors.

The conference was kicked off with a keynote from Andy Jassy, CEO of Amazon Web Services, who presented some impressive growth numbers and announced a host of new updates to the AWS portfolio of services. The biggest announcements were around the new Artificial Intelligence (AI) services called Lex, Rekognition and Polly, and the data migration appliances Snowmobile and Snowball Edge. He also launched Amazon Lightsail, which allows developers to set up a virtual private server (VPS) with just a few clicks.

The second keynote, presented by Amazon Web Services CTO Werner Vogels, was more focused on new development tools, Big Data, Security and Mobile services.

Here’s a rundown of the key announcements coming out of re:Invent this year.  

Amazon AI

One of the most significant announcements from Andy Jassy’s keynote was the launch of Amazon Lex, Amazon’s first AI service. Amazon Lex is a service for building conversational interfaces into any application using voice and text. It’s the technology that’s at the heart of the Amazon Alexa platform. This chatbot-friendly service is in preview.

Another AI service launched was Amazon Rekognition. Rekognition allows developers to add image analysis to applications. It can analyze and detect facial features and objects such as cars and furniture. Jassy also announced the launch of Amazon Polly, which converts text into speech. Polly is a fully managed service, and you can even cache responses, making it cost efficient. It is available in 47 voices and 24 languages.

Internet of Things (IoT)

AWS Greengrass is another interesting service launched at re:Invent. AWS Greengrass lets you run local compute, messaging & data caching for connected devices in a secure way. Greengrass seamlessly extends AWS to devices so they can act locally on the data they generate, while still using the cloud for management, analytics, and durable storage. It allows IoT devices to respond quickly to local events, operate with intermittent connections, and minimize the cost of transmitting IoT data to the cloud.

Data storage and services

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon’s Simple Storage Service (S3) using SQL. It is a great addition since it allows developers to use standard SQL syntax to query data that’s stored in S3 without setting up the infrastructure for it. This service works with CSV, JSON, log files, delimited files, and more.

Amazon Aurora, a cloud-based relational database, now supports PostgreSQL. It’s already compatible with open source standards such as MySQL and MariaDB.

Serverless

AWS Lambda, a serverless computing service, got a couple of updates as well. Amazon announced Lambda@Edge, a new Lambda-based processing model that allows you to write code that runs within AWS edge locations. This lightweight request processing logic will handle requests and responses that flow through a CloudFront distribution. It is great for developers who need to automate simple tasks in their CDN deployment so that traffic does not have to be routed back to a server.

Lambda functions now include support for Microsoft’s C# programming language, in addition to the existing support for Node.js, Python and Java. Amazon also unveiled AWS Step Functions as a way to create a visual state machine workflow out of your functions.

Compute

As is tradition at re:Invent, Amazon announced a series of new core computing capabilities for its cloud. It launched F1 instances that support programmable hardware, memory-optimized R4 instances, burstable-performance T2 instances, compute-optimized C5 instances and I/O-intensive I3 instances. Andy Jassy also announced Amazon EC2 Elastic GPUs, a way to attach GPU resources to EC2 instances. With Elastic GPUs for EC2, you can easily attach low-cost graphics acceleration to current generation EC2 instances.

Another important compute service launched is Amazon Lightsail. It allows developers to launch a virtual private server with just a few clicks. I think it is a great addition to the portfolio, as it allows small business owners and bloggers to host their websites on AWS.

Migration/ Data Transfer

Expanding on the scope of Snowball, which was launched last year, AWS added Snowball Edge and Snowmobile to the lineup. While Snowball provided 50TB of storage, each Snowball Edge appliance has 100TB of storage and offers more connectivity protocols than the previous version. You now also have Snowmobile to meet the needs of customers with petabytes of data. Snowmobile is a 45-foot container that is delivered to customers on a trailer truck. This secure data truck stores up to 100 PB of data and can help companies move exabytes of data to AWS in a matter of weeks instead of years. Snowmobile attaches to the client’s network and appears as a local, NFS-mounted volume.

Development tools

Amazon added AWS CodeBuild to the existing suite of developer tools like CodeCommit, CodeDeploy and CodePipeline. AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. CodeBuild can be a cost-effective and scalable alternative to running a dedicated Jenkins instance.

AWS X-Ray helps developers analyze and debug distributed applications in production, such as those built using a microservices architecture. X-Ray provides an end-to-end view of requests as they travel through the application and helps developers identify and troubleshoot the root cause of performance issues and errors. AWS X-Ray is in preview.

Monitoring Operations and Security

Similar to the AWS Service Health Dashboard, AWS now provides a Personal Health Dashboard. As the name indicates, this dashboard gives you a personalized view into the performance and availability of the AWS services that you are using, along with alerts that are automatically triggered by changes in the health of those services.

DDoS (Distributed Denial of Service) attacks are one very common trouble spot. Amazon’s new offering is AWS Shield, a DDoS protection service that safeguards web applications running on AWS. AWS Shield provides always-on detection and automatic inline mitigations that minimize application downtime and latency, so there is no need to engage AWS Support to benefit from DDoS protection. It provides DDoS protection at the DNS, CDN, and load balancer tiers, and is available in free and premium flavors.

Big Data and Compute

AWS Batch, a service for automating the deployment of batch processing jobs, was released in preview. AWS Batch enables developers, administrators, and users to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. With Batch, users have access to the power of the cloud without having to provision, manage, monitor, or maintain clusters, and there is no software to buy or install. AWS Glue, a fully managed ETL service that makes it easy to move data between your data stores, was also launched.

Mobile Services

Dr. Vogels also launched Amazon Pinpoint, a mobile analytics service. Amazon Pinpoint makes it easy to run targeted campaigns to drive user engagement in mobile apps through the use of targeted push notifications.

AWS refers to re:Invent as an educational event, and they were very successful in achieving this in 2016. You can find recordings of the keynotes and tech talks on YouTube.

written by: Praveen Modi (Sr Technical Architect)

AEM on AWS whitepaper

For a number of years now, I have been working on solutions that involve Adobe Experience Manager (AEM) and Amazon Web Services (AWS) to deliver digital experiences to consumers, partly through custom implementations for our clients, and partly through Fluent, our turnkey digital marketing platform that combines AEM, AWS and Razorfish Managed Services and DevOps.

In the past year, I had the pleasure of working with AWS on creating a whitepaper that outlines some of our ideas and best practices around deploying on the AWS infrastructure.

The whitepaper delves into a few areas:

Why Use AEM on AWS?

Cloud-based solutions offer numerous advantages over on-premise infrastructure, particularly when it comes to flexibility.

For example, variations in traffic volume are a major challenge for content providers. After all, visitor levels can spike for special events such as Black Friday shopping, the Super Bowl, or other one-time occasions. AWS’s cloud-based flexible capacity enables you to scale workloads up or down as needed.

Marketers and businesses utilize AEM as the foundation of their digital marketing platforms. Running it on AWS facilitates easy integration with third-party solutions, and makes it a complete platform. Blogs, social media, and other auxiliary channels are simple to add and to integrate. In particular, using AEM in conjunction with AWS allows you to combine the APIs from each into powerful custom configurations uniquely suited to your business needs. As a result, the tools for new features such as mobile delivery, analytics, and managing big data are at your fingertips.

A Look Under the Hood – Architecture, Features, and Services

In most cases, the AEM architecture involves three sets of services:

  1. Author – Used for the creation, management, and layout of the AEM experience.

  2. Publisher – Delivers the content experience to the intended audience. It includes the ability to personalize content and messaging to target audiences.

  3. Dispatcher – This is a caching and/or load-balancing tool that helps you realize a fast and performant experience for the end-user.

The whitepaper details some common configuration options for each service, and how to address that with the different storage options AEM provides. It also outlines some of the security considerations when deploying AEM in a certain configuration.

AWS Tools and Management

Not only does AWS allow you to build the right solution, it provides additional capabilities to make your environment more agile and secure.

Auditing and Security

The paper also highlights a couple of capabilities to make your platform more secure. For example, AWS has audit tools such as AWS Trusted Advisor. This tool automatically inspects your environment and makes recommendations that will help you cut costs, boost performance, improve reliability, and enhance security. Other recommended tools include Amazon Inspector, which scans for vulnerabilities and deviations from best practices.

Automation through APIs

AWS provides API access to all its services, and AEM does the same. This allows for a very clean organization of the integration and deployment process. Used in conjunction with an open-source automation server like Jenkins, you can initiate manual, scheduled, or triggered deployments.

You can fully automate the deployment process. However, depending on which data storage options are used, separate arrangements may need to be made for backing up data. Also, policies and procedures for dealing with data loss and recovery need to be considered too.

Additional AWS Services

There are numerous other services and capabilities you can leverage when you use AEM in conjunction with AWS.

One great service is Amazon CloudWatch, which allows you to monitor a variety of performance metrics from a single place.

In addition, the constant stream of new AWS offerings, such as Amazon’s Elastic File System (which allows you to configure file storage for your servers), provides new options for your platform.

Takeaway

Using Adobe’s Experience Manager in tandem with AWS provides a powerful platform for easily delivering highly immersive and engaging digital experiences. It is a solution that answers the dilemma that many marketers and content providers face – how to deliver an immersive, relevant, and seamless experience for end users from a unified platform, and the whitepaper aims to provide more insight into how to achieve this.

To learn more about running AEM on AWS, you can download and read the full whitepaper here.

written by: Martin Jacobs (GVP, Technology)

Lessons Learned from a Reactive Serverless CMS

Background

As mentioned in previous posts, we are big proponents of reactive architectures at Razorfish.

We also believe architectures using cloud functions — such as AWS Lambda — are part of the future of application development. In this post, we will call them “serverless” architectures because although there are obviously still servers involved, we’re not responsible for managing them anymore.

The relaunch of our technology blog provided the perfect opportunity to test this new architecture. In the paragraphs that follow, I’ll briefly discuss the architecture, followed by a summary of the lessons we learned.

Solution Summary

We architected the solution using Amazon AWS S3, Lambda, Cloudfront, Hugo, and Github. It incorporates an authoring UI, as well as a mechanism to do publishing. The diagram below shows some of the integration mechanisms. For the full technical details of the implementation, visit the earlier post on the Razorfish technology blog.

Learning — Serverless: Development Model

Obviously, development using AWS Lambda is quite different than your standard processes. But there’s good news: A large number of new tools are trying to address this, and we explored a few of them. Some of the most interesting include:

  • Lambda-local. This is a basic command line tool you can use to run the Amazon Lambda function on local machines.
  • Node-Lambda. Similar to Lambda-local, this tool can run a function locally, and it also provides support for deploying the function to AWS.
  • Apex. This large framework can be used to deploy lambda functions, potentially written in additional languages such as Go — which Apex itself is written in. The tool provides support for Terraform to manage AWS resources.
  • Kappa — Another tool for deployment of Lambda functions, using the AWS API for creation of resources.
  • Serverless. An application framework for building applications using AWS Lambda and API Gateway. It tries to streamline the development of a microservices-based application. It creates AWS resources using CloudFormation templates, giving you the benefits of tracking and managing resource creation. It also supports different types of plugins, allowing you to quickly add additional capabilities to an application (e.g., logging). One of the objectives of the tool is to support multiple cloud providers, including Google and Azure Cloud Functions.
  • λ Gordon — Similar to Apex, a solution to create and deploy lambda functions, using CloudFormation to manage these resources, with a solid set of example functions.
  • Zappa. Zappa allows you to deploy Python WSGI applications on AWS Lambda + API Gateway. Django and Flask are examples of WSGI applications that can now be deployed on AWS Lambda using Flask-Zappa or Django-Zappa.

In addition to these tools, IDEs have developed ways to make it easier to create and deploy lambda functions. For example, Visual Studio and Eclipse have tools to make it easier to create, test, and deploy functions.

Lambda-local was the tool of choice for the serverless CMS application created for our blog. Its simplicity is helpful, and one of the unique challenges we faced was the support needed for binaries like Hugo and Libgit2, which required development both on the local machines and on an Amazon EC2 Linux instance.

Learning — Serverless: Execution Model

Although the initial use cases for AWS Lambda and other similar solutions have centered on executing backend tasks like image resizing, interactive web applications can become an option as well.

For a start, many solutions don’t necessarily need to be a server-side web application, and can often be architected as a static site with client-side JavaScript for dynamic functionality. In the AWS scenario, this means a site hosted on S3 or CloudFront that then integrates with AWS Lambda using the JavaScript SDK or the API Gateway, similar to how this was done for the Razorfish blog.

But in case the dynamic element is more complex, there is a great potential for full-featured frameworks like Zappa that allow you to develop interactive web applications that can run on AWS Lambda using common frameworks such as Django and Flask. In my opinion, this is also where AWS can get significant competition from Azure Functions, as Microsoft has an opportunity to create very powerful tools with their Visual Studio solution.

Overall, AWS Lambda is a great fit for many types of applications. The tool significantly simplifies the management of applications; there’s limited need to perform ongoing server monitoring and management that is required with AWS EC2 or AWS Elastic Beanstalk.

On top of that, Lambda is incredibly affordable. As an example, if you required 128MB of memory for your function and executed it 30 million times in one month at 200ms each time, your monthly bill would be $11.63 — which is cheaper than running most EC2 instances.
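
For reference, here is the arithmetic behind that estimate, using the Lambda pricing published at the time ($0.20 per million requests and $0.00001667 per GB-second, with a monthly free tier of one million requests and 400,000 GB-seconds):

```javascript
// Worked example of the $11.63 estimate (pricing as published at the time of writing).
const requests = 30e6;          // invocations per month
const durationSeconds = 0.2;    // 200 ms per invocation
const memoryGb = 128 / 1024;    // 128 MB expressed in GB

const computeGbSeconds = requests * durationSeconds * memoryGb; // 750,000 GB-seconds
const computeCost = (computeGbSeconds - 400000) * 0.00001667;   // minus free tier, ≈ $5.83

const requestCost = ((requests - 1e6) / 1e6) * 0.20;            // minus free tier, ≈ $5.80

console.log((computeCost + requestCost).toFixed(2));            // "11.63"
```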

The Razorfish technology blog architecture is network intensive. It retrieves and uploads content from S3 or Github. With AWS Lambda, you choose the amount of memory you want to allocate to your functions and AWS Lambda allocates proportional CPU power, network bandwidth, and disk I/O. So in this case, an increase in memory was needed to ensure enough bandwidth for the Lambda functions to execute in time.

Learning — Reactive flows

The second goal of the creation of our blog was to apply a reactive architecture. A refresher: Reactive programming is a programming style oriented around data flows and the propagation of change. Its primary style is asynchronous message passing between components of the solution, which ensures loose coupling.

In the blog scenario, this was primarily achieved by using S3 events, Github hooks, and SNS message passing. Some examples:

  • When one Lambda function finishes, an SNS message is published to trigger the next Lambda function.
  • Client-side content updates are posted to S3, and the resulting S3 event triggers a Lambda function.
  • A Github update posts to SNS, and the SNS message triggers a Lambda function.

Overall, this allowed for a very simple architecture. It also makes it very straightforward to test and validate parts of the solution in isolation.
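
To illustrate the first pattern, here is a simplified sketch rather than the actual blog code; the topic ARN, environment variable and message fields are assumptions. One function finishes its work and publishes an SNS message, and a second function subscribed to that topic picks it up.

```javascript
// Minimal sketch of SNS-based chaining between two Lambda functions.
const AWS = require('aws-sdk');
const sns = new AWS.SNS();

// Function A: generate the site, then announce completion on an SNS topic.
exports.generateSite = async (event) => {
  // ... run Hugo, write the generated site to S3 ...
  await sns.publish({
    TopicArn: process.env.SITE_GENERATED_TOPIC_ARN, // assumed environment variable
    Message: JSON.stringify({ commit: event.commit, status: 'generated' })
  }).promise();
};

// Function B: subscribed to that topic, promotes the generated site to the live bucket.
exports.publishSite = async (event) => {
  for (const record of event.Records) {
    const message = JSON.parse(record.Sns.Message);
    console.log('Publishing build for commit', message.commit);
    // ... copy the generated site to the live bucket, invalidate the CDN ...
  }
};
```

Because each function only knows about the message it receives or publishes, the two stay loosely coupled and can be tested in isolation.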

One of the key challenges, however, is that in some scenarios it becomes difficult to keep track of all the different events and the resulting messages they generate. This can potentially result in loops or cascading effects.

The Developer’s Takeaway

Overall, I believe the architectural style of reactive and serverless has a lot of promise — and may even be transformational with respect to developing applications in the future. The benefits are tremendous, and will allow us to really take cloud architectures to the next level. For this reason alone, developers should consider how this approach could be incorporated into every new project.

written by: Martin Jacobs (GVP, Technology)

Sitecore with React and SEO

On a recent Sitecore build-out, the architecture included, among other things, a significant amount of functionality that was provided by a client-side accessed API.  We chose React to connect to and provide the UI for that API.  Since we were already using React for some things, we chose to standardize on React and use it for all of the UI.

The prevailing approach to be found around the web was to write the entire page as a single React component.  The rare article or guide that spoke of using smaller standalone components all suggested supplying the components with their data in the form of JSON.

Let’s take a look at the example of a simple, standalone component on React’s homepage:

On the last line (line 7) above, we see that the property value, subject="World", is being supplied to the component. But let’s assume that "World" is text that is managed by the CMS.  How would we get that text from the CMS to React? Popular thinking suggests outputting all the data you would need as JSON–and passed directly to React.render, inside of a script tag, much as you see on line 22 below.

The problem is, that approach doesn’t provide very good SEO value.

So how can we provide the data to our components in an SEO-friendly way? Interestingly, we can use one of the technologies introduced to help search engines improve the semantic value of the data they crawl: schema.org.

Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.

Schema.org vocabulary can be used with many different encodings, including RDFa, Microdata and JSON-LD. These vocabularies cover entities, relationships between entities and actions, and can easily be extended through a well-documented extension model.

Schema.org markup includes the concepts of itemscope and itemprop. We adapted this approach to provide a markup structure and attributes that would readily convert to the JSON-full-of-component-props that we need to render React components.

A simplified version of the final result:

As you can see, the data-react="%ReactComponentName%" attribute-value pair identifies a markup structure as a React component. Inside this structure, the data-prop="%propertyName%" attribute-value pair signifies that the contents of a given HTML tag represent the value of that property. For instance, data-react="HelloMessage" identifies a markup structure as representing the place in the page where a HelloMessage component should render, and the markup structure also contains data-prop attributes to provide the props data for the component. The first HelloMessage has data-prop="greeting" with text contents of "Hello". This is converted to { greeting: "Hello" } before being passed in when rendering the component.

Consider the following markup:

The above markup gets converted to the JavaScript object below:

As an added convenience, the React components are not rendered into some other DOM Node, as in the previous examples. Instead, quite naturally, they are rendered in-place right where they are defined in the markup, using the props defined inside the same markup.
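The original post’s markup and code screenshots are not reproduced here, but a simplified sketch of the approach, with illustrative component and property names, looks roughly like this:

```javascript
// Simplified sketch: render React components in-place from data-react / data-prop markup.
// Assumes React and ReactDOM are loaded globally. Example markup:
//
// <div data-react="HelloMessage">
//   <span data-prop="greeting">Hello</span>
//   <span data-prop="subject">World</span>
// </div>

// Hypothetical component and registry used for illustration.
const HelloMessage = (props) =>
  React.createElement('div', null, `${props.greeting}, ${props.subject}!`);
const components = { HelloMessage };

document.querySelectorAll('[data-react]').forEach((node) => {
  // Collect data-prop elements into a props object, e.g. { greeting: "Hello", subject: "World" }.
  const props = {};
  node.querySelectorAll('[data-prop]').forEach((propNode) => {
    props[propNode.dataset.prop] = propNode.textContent.trim();
  });

  // Render the component right where it is defined, replacing the CMS-rendered markup.
  ReactDOM.render(React.createElement(components[node.dataset.react], props), node);
});
```

The CMS-rendered text stays in the HTML that search engines crawl, while the client-side script turns it into component props at runtime.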

And there we have it, Sitecore with React and SEO.

written by: Dennis Hall (Presentation Layer Architect, Technology)

What the Rise of Cloud Computing Means for Infrastructure

Infrastructure setup and application programming are merging into simultaneous processes. With this critical change, we need to take a fresh look at how we design solutions. If we don’t, our projects risk failure.

Building and installing system infrastructure (think servers and networks) was once an arduous process. Everything had to be planned out and procured, often at high cost and with long lead times. Oftentimes, server specifications were created before the actual application (and the technologies involved) that would need to run on it had been fully fleshed out. The actual application programming task was a whole separate step with little overlap.

That’s no longer the case due to the rise of cloud computing. Infrastructure is now software, and the convenience of that leads to new challenges.

Merging Designs

With cloud computing, infrastructure is far more fluid thanks to all the programmable elements. As a result, upfront planning isn’t as important, as cost and especially timelines are no longer the constraints they once were. Compute, storage and network capacity is immediately accessible and can be changed dynamically to suit any need.

With these changes, the days of separate tracks for application and infrastructure development are over. The once separate design processes for each of them need to merge as well. This is largely driven by three factors:

  1. Historically, the separation of application and infrastructure development didn’t work, but it was accepted as a given.
  2. Cloud architectures take on a bigger role than traditional infrastructure.
  3. New architectures create new demands.

The Historical Challenge

Performance, availability and scalability have been a challenge forever. Before cloud architectures became standard, vendors had been trying to address these requirements with complex caching architectures and similar mechanisms. The reality is that none of the products really delivered on this premise out of the box. Obviously, one core challenge was that companies were trying to deliver dynamic experiences on a fixed infrastructure.

But even within that fixed infrastructure, any deployment required exhaustive performance tuning cycles and vendor support, trying to overcome the issue of infrastructure designed independently from the application, with only moderate success.

The Changing Infrastructure Role

Cloud architectures also start to play a bigger role in the overall systems stack. Let’s look at a hypothetical basic Java application with an API built on Amazon Web Services, the most popular cloud computing service, to see what the merger of system infrastructure and application programming looks like.

The application can be developed like any other Java application, but when it comes to how security is addressed, what is specified where?

On the application side, there could be some internal security mechanisms that define what access to services is available. Internal application roles can determine what access the service request has to different data elements. From an infrastructure perspective, Amazon Web Services can also provide security measures (access to ports, another layer of permissions, etc.) that affect how the application API can be accessed by clients. In addition, AWS policies can define which requests arrive at the application, or which data elements are available once a request is being serviced.
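
As a small illustration of that layering, consider the two hypothetical snippets below: the application enforces its own roles in code, while an IAM policy, defined entirely outside the application, decides which requests can reach the API at all. Both the role names and the policy values are assumptions.

```javascript
// Application-level check: an internal role decides which data a request may see.
function getOrderSummary(user, order) {
  if (!user.roles.includes('order-viewer')) {
    throw new Error('Forbidden'); // enforced inside the application itself
  }
  return { id: order.id, total: order.total };
}

// Infrastructure-level check: an IAM policy (normally JSON in AWS, shown here as an
// object) restricts who may invoke the API in the first place.
const invokePolicy = {
  Version: '2012-10-17',
  Statement: [{
    Effect: 'Allow',
    Action: 'execute-api:Invoke',
    Resource: 'arn:aws:execute-api:us-east-1:123456789012:example-api/*/GET/orders',
    Condition: { IpAddress: { 'aws:SourceIp': '203.0.113.0/24' } }
  }]
};
```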

As this example shows, the application and infrastructure views need to be merged in order to fully understand the security mechanisms available. Just focusing on one side or the other paints an unclear picture.

New Architectures

A number of new architectures have been created now that infrastructure is programmable. Reactive architectures and code executors like Google Cloud Functions and AWS Lambda are examples of these serverless computing services. Once we start using fully dynamic infrastructures for auto-scaling and microservices, the need for an integrated view of both the application and the systems becomes even more important.

Finding New Solutions

Handling infrastructure and application development in an integrated manner is difficult.

One of the challenges is that the design tools to visualize this are lacking. Tools like Cloudcraft help in this regard, but a fully integrated view is still missing, especially if you start using new architectures like AWS Lambda. Ideally, there’d be a way to visually layer the different perspectives of an architecture in a way that resembles a Photoshop image. Easily looking at an architecture from the perspective of security, services, data flows, and so on would be incredibly useful.

From a process perspective, infrastructure and application have to be handled with the same processes. This includes code management, defect tracking and deployment. This of course has implications on the skills and technology needed to successfully complete a project, and not all organizations are ready for this yet.

Conclusion

These days, infrastructure and application are intertwined, and an application solution that doesn’t address the infrastructure element is incomplete. Focusing on one without the other cannot address the critical requirements around security, performance, scalability, availability and others. It is important to invest in the tools, processes and people to deliver on this.

written by: Martin Jacobs (GVP, Technology)

A Reactive Serverless CMS for the Technology Blog

Background

At Razorfish, we are big proponents of reactive architectures. Additionally, we believe architectures using cloud functions such as AWS Lambda are part of the future of application development. Our relaunch of the blog was a good occasion to test this out.

Historically, the blog had been hosted on WordPress. Although WordPress is a good solution, we had run into some performance issues. While there are many good ways to address performance challenges in WordPress, it was a good time to explore a new architecture for the blog, as we weren’t utilizing any WordPress-specific features.

We had used static site generators for a while for other initiatives, and looked at these types of solutions to create the new site. We wanted to avoid any running servers, either locally or in the cloud.

Our technology stack ended up as follows:

  • Github – Contains two repositories: a content repository with Hugo-based themes, layout and content, and a code repository with all the CMS code.

  • AWS Lambda – Executes the different functions of the CMS.

  • Hugo – Site Generator written in the Go Programming Language

  • AWS S3 – Source and generated sites are stored on S3

  • AWS CloudFront – CDN for delivery of site.

Why Hugo?

There are a large number of site generators available, ranging from Jekyll to Middleman. We explored many of them, and decided on Hugo for a couple of reasons:

  • Speed – Hugo generation happens in seconds

  • Simplicity - Hugo is very easy to install and run. It is a single executable, without any dependencies

  • Structure - Hugo has a flexible structure, allowing you to go beyond blogs.

Architecture

The architecture is outlined below. A number of Lambda functions are responsible for executing the different functions of the CMS. Some of the use of Hugo was derived from http://bezdelev.com/post/hugo-aws-lambda-static-website/. The authentication function was loosely derived from https://github.com/danilop/LambdAuth.

The solution uses AWS Lambda’s capability to run executables. This is used for invoking Hugo, but also for incorporating libgit2, which allows us to execute git commands and integrate with Github.
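
As an illustration of what that looks like, here is a simplified sketch rather than the actual implementation; the binary location, Hugo flags and paths are assumptions. The Hugo executable is bundled with the function package and invoked as a child process.

```javascript
// Minimal sketch: invoke a bundled Hugo binary from a Lambda function.
const { execFile } = require('child_process');
const path = require('path');

exports.handler = (event, context, callback) => {
  // The hugo executable ships inside the deployment package; /tmp is the only
  // writable location inside the Lambda environment.
  const hugoBinary = path.join(__dirname, 'bin', 'hugo');

  execFile(
    hugoBinary,
    ['--source', '/tmp/site-source', '--destination', '/tmp/site-output'],
    (error, stdout, stderr) => {
      if (error) {
        return callback(error);
      }
      console.log(stdout);
      // ... upload /tmp/site-output to the destination S3 bucket ...
      callback(null, 'site generated');
    }
  );
};
```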

CMS

As part of the solution, a CMS UI was developed to manage content. It allows the author to create new content, upload new assets, and make other changes.

Content is expected to be in Markdown format, but this is simplified for authors with the help of the hallojs editor.

Preview is supported with different breakpoints, including a mobile view.

As it was developed as a reactive architecture, other ways to update content are available:

  • Through a commit on github, potentially using github’s markdown editor.

  • Upload or edit markdown files directly on S3

Challenges

As the solution was architected, a few interesting challenges had to be addressed.

  1. At the time of development, only Node 0.10 was supported on AWS Lambda. To utilize solutions like libgit2, a more recent version of Node was needed. To do so, a newer Node executable was packaged as part of the deployment, and the Node 0.10 runtime spawned the more recent Node version.

  2. Only the actual site should be accessible. CloudFront signed cookies provided a mechanism to prevent preview and other environments from being directly accessible.

  3. Hugo and libgit2 need to be compiled for the AWS Lambda Linux environment, which can be a challenge when all other development occurs on Windows or Macs.

Architecture benefits

The reactive architecture approach makes it really easy to enhance and extend the solution with other options of integrating content or experience features.

For example, as an alternative to the described content editing solutions above, other options can be identified:

  • A headless CMS like Contentful could be added for a richer authoring UI experience.

  • By using SES and Lambda to receive and process the email, an email content creation flow could be setup.

  • A converter like pandoc running on AWS Lambda could be incorporated into the authoring flow, for example for converting source documents to the target Markdown format. It could be invoked from the CMS UI, or from the email processor.

From an end-user experience perspective, Disqus or other third-party providers are obvious examples to incorporate comments. However, the Lambda architecture can also be an option to easily add commenting functionality.

Conclusion

Although more and more tools are becoming available, AWS Lambda development and code management can still be a challenge, especially in our scenario with OS-specific executables. However, from an architecture perspective, the solution is working very well. It has become very predictable and stable, and allows for a fully hands-off approach to management.

written by: Martin Jacobs (GVP, Technology)

How Machine Learning Can Transform Content Management

In previous posts, I explored the opportunities for machine learning in digital asset management, and, as a proof-of-concept, integrated a DAM solution (Adobe AEM DAM) with a set of machine learning APIs.

But the scope of machine learning extends much further. Machine learning can also have a profoundly positive impact on content management.

In this context, machine learning is usually associated with content delivery. The technology can be used to deliver personalized or targeted content to website visitors and other content consumers. Although this is important, I believe there is another opportunity that stems from incorporating machine learning into the content creation process.

Questions Machine Learning Can Answer

During the content creation process, content can be analyzed by machine learning algorithms to help address some key questions:

  • How does content come across to readers? Which tones resonate the most? What writer is successful with which tone? Tone analysis can help answer that.
  • What topics are covered most frequently? How much duplication exists? Text clustering can help you analyze your overall content repository.
  • What is the article about? Summarization can extract relevant points and topics from a piece of text, potentially helping you create headlines.
  • What are the key topics covered in this article? You can use automatic topic extraction and text classification to create metadata for the article to make it more linkable and findable (both within the content management tool internally and via search engines externally).

Now that we know how machine learning can transform the content creation process, let’s take a look at a specific example of the technology in action.

An Implementation with AEM 6.2 Content Fragments and IBM Watson

Adobe just released a new version of Experience Manager, which I discussed in a previous post. One of the more important features of AEM 6.2 is the concept of “Content Fragments.”

In the past, content was often tied to pages. But Content Fragments allow you to create channel-neutral content with (possibly channel-specific) variations. You can then use these fragments when creating pages.

Content Fragments are treated as assets, which makes them great candidates for applying analysis and classification. Using machine learning, we’re able to analyze the tone of each piece of content. Tones can then be associated with specific pieces of content.

In the implementation, we used IBM Bluemix APIs to perform tone analysis. The Tone Analyzer computes emotional tone (joy, fear, sadness, disgust or anger), social tone (openness, conscientiousness, extraversion, agreeableness or emotional range) and language tone (analytical, confident or tentative).

The Tone Analyzer also provides insight on how the content is coming across to readers. For each sub-tone, it provides a score between 0 and 1. In our implementation, we associated a sub-tone with the metadata only if the score was 0.75 or higher.
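As a rough sketch of that filtering logic, the snippet below calls the Tone Analyzer over REST and keeps sub-tones scoring 0.75 or higher. The endpoint, version date and response shape follow the v3 API as it existed at the time and may have changed since; the credentials are placeholders, and this is not the integration code from the repository.

```javascript
// Minimal sketch: analyze tone with IBM Watson and keep sub-tones scoring >= 0.75.
// Uses the global fetch available in modern Node.js.
const THRESHOLD = 0.75;

async function toneTags(text, username, password) {
  const response = await fetch(
    'https://gateway.watsonplatform.net/tone-analyzer/api/v3/tone?version=2016-05-19',
    {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: 'Basic ' + Buffer.from(`${username}:${password}`).toString('base64')
      },
      body: JSON.stringify({ text })
    }
  );
  const result = await response.json();

  // document_tone.tone_categories covers emotional, social and language tone;
  // each sub-tone comes with a score between 0 and 1.
  const tags = [];
  for (const category of result.document_tone.tone_categories) {
    for (const tone of category.tones) {
      if (tone.score >= THRESHOLD) {
        tags.push(tone.tone_name); // e.g. "Joy", "Openness", "Analytical"
      }
    }
  }
  return tags;
}
```

In the AEM implementation, the resulting tags are written to the Content Fragment’s metadata so they can be used for search and reporting.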

The Results of Our Implementation

If you want to take a look, you’ll find the source code and setup instructions for the integration of Content Fragments with IBM Watson Bluemix over on GitHub.

We ran our implementation against the text of Steve Jobs’ 2005 Stanford commencement speech, and the results are shown below. For every sub-tone with a score of 0.75 or higher, a metadata tag was added.

Results

The Takeaway from Our Implementation

Machine learning provides a lot of opportunities during the content creation process. But it also requires authoring and editing tools that seamlessly integrate with these capabilities to really drive adoption and make the use of those machine-learning insights a common practice.

Personally, I can’t wait to analyze my own posts from the last few months using this technology. That analysis, combined with LinkedIn analytics, will enable me to explore how I can improve my own writing and make it more effective.

Isn’t that what every content creator wants?

written by: Martin Jacobs (GVP, Technology)

How Machine Learning Can Transform Digital Asset Management - Part II

A few weeks ago, I discussed the opportunities for machine learning in digital asset management (DAM), and, as a proof of concept, integrated a DAM solution (Adobe AEM DAM) with Google Cloud Vision, a newly released set of APIs for image recognition and classification.

Now, let’s explore some alternatives to Cloud Vision.

IBM Watson

To follow up, we integrated IBM’s offering. As part of BlueMix, IBM actually has two sets of APIs: the AlchemyAPI (acquired in March 2015) and the Visual Recognition API. The ability to train your own custom classifier in the Visual Recognition API is the key difference between the two.

There are a number of APIs within the AlchemyAPI, including a Face Detection/Recognition API and an Image Tagging API. The Face API includes celebrity detection and disambiguation of a particular celebrity (e.g., which Jason Alexander?).

Result sets can provide an age range for the identified people in the image. In a DAM scenario, getting a range instead of a number can be particularly helpful. The ability to create your own custom classifier could be very valuable with respect to creating accurate results for your specific domain. For example, you could create a trained model for your products, and organize your brand assets automatically against this. It would enable you to further analyze usage and impact of these assets across new and different dimensions.

Leveraging the APIs was fairly straightforward. Similar to Google, adding the API to an application is simple; just build upon the sample code provided by IBM. It’s worth noting, however, that IBM’s API has a 1MB image size restriction, somewhat lower than Google’s and Microsoft’s 4MB limit.

Microsoft Cognitive Services

Microsoft is interesting, especially considering they won the most recent ImageNet Large Scale Visual Recognition Challenge. As part of its Cognitive Services offering, Microsoft released a set of applicable Vision APIs (though they’re still in preview mode). For our purposes, the most relevant APIs are:

  • Computer Vision: This API incorporates an ability to analyze images and derive the appropriate tags with their confidence score. It can detect adult and racy content, and similar to Google’s Cloud Vision API, it has an Optical Character Recognition (OCR) capability that reads text in images. Besides tags, the API can provide English language descriptions of an image — written in complete sentences. It also supports the concepts of models. The first model is celebrity recognition, although we couldn’t get that one to work for straightforward celebrities like Barack Obama and Lionel Messi (it also doesn’t seem to work on the landing page).
  • Emotion: This API uses a facial expression in an image as an input. It returns the confidence level across a set of emotions for each face in relevant images.
  • Face: This API is particularly interesting, as it allows you to perform face recognition within a self-defined group. In a DAM scenario, this could be very relevant. For example, when all product images are shot with a small set of models, it can easily and more accurately classify each image with respect to various models. If an organization has contracts with a small set of celebrities for advertising prints, classification becomes that much more accurate.

The Microsoft APIs are dependent on each other in certain scenarios. For example, the Emotion API leverages the Face API to first identify faces within an image. Similarly, the Computer Vision API and Face API both identify gender and other attributes of people within an image.

Although Microsoft didn’t provide a sample Java API, the REST API is easy to incorporate. The source code and setup instructions for the integration with Google, Microsoft, and IBM’s solution can be found on Github.
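
For illustration, here is a minimal sketch of calling the Computer Vision analyze endpoint over REST. The region-specific URL and API version reflect the preview endpoints of the time and may differ today, the subscription key is a placeholder, and this is not the repository code.

```javascript
// Minimal sketch: request tags, a description and faces for an image from Computer Vision.
// Uses the global fetch available in modern Node.js.
async function analyzeImage(imageUrl, subscriptionKey) {
  const endpoint =
    'https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze' +
    '?visualFeatures=Tags,Description,Faces';

  const response = await fetch(endpoint, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Ocp-Apim-Subscription-Key': subscriptionKey
    },
    body: JSON.stringify({ url: imageUrl })
  });

  const result = await response.json();
  // result.tags is a list of { name, confidence }; result.description.captions
  // holds the English sentence descriptions mentioned above.
  return result;
}
```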

Adobe Smart Tags

At the recent Adobe Summit conference, Adobe also announced the use of machine intelligence for smart tagging of assets as a beta capability of their new AEM 6.2 release. According to Adobe, it can automatically tag images with keywords based on:

  • Photo type (macro, portrait, etc.)
  • Popular activities (running, skiing, hiking, etc.)
  • Certain emotions (smiling, crying, etc.)
  • Popular objects (cars, roads, people, etc.)
  • Animals (dogs, cats, bears, etc.)
  • Popular locations (New York City, Paris, San Francisco, etc.)
  • Primary colors (red, blue, green)

There are even more categories for automatic classification, too.

Automatic Tagging Use Cases

In the previous post, I highlighted a couple of key use cases for tagging using machine intelligence in DAM. In particular, I highlighted how tagging can support the content migration process or improve manual efforts performed by DAM users.

Better metadata makes it easier for authors, editors, and other users of the DAM to find content during the creation process. It can also help in providing asset recommendations to content authors. For example, it’s often important to have a diverse mix of people portrayed on your site. With gender, age, and other metadata attributes as part of the image, diversity can be enforced using asset recommendation or asset usage reports within a DAM or content management system.

What’s more, this metadata can also help improve targeting and effectiveness of the actual end-user experience by:

  1. Allowing the image to be selected as targeted content
  2. Using the metadata in an image to ensure relevant ads, content, and assets are presented in context within an asset
  3. Informing site analytics by incorporating image metadata in click tracking and other measurement tools

In addition to these use cases, new scenarios are being created. Microsoft automatically generates captions for your photos. Facebook is using machine intelligence to automatically assign alt text to photos uploaded to Facebook, and, in doing so, improve overall accessibility for Facebook users. Obviously, this type of functionality will also enable Facebook and Microsoft to provide more targeted content and ads to users interacting with specific photos, a win-win. As metadata is used for end-user consumption in these cases, the unique challenge of needing to support multilingual tagging and descriptions arises.

With companies like Adobe, IBM, Google, and Microsoft pouring a ton of resources into machine learning, expect a lot of changes and improvements in the coming years. Relatively soon, computers will outperform humans in classification and analysis.

As it relates to Digital Asset Management, it remains to be seen precisely what the exact improvements will be. But one thing is certain: Machine learning technology promises a lot of exciting possibilities.

written by: Martin Jacobs (GVP, Technology)

How Machine Learning Can Transform Digital Asset Management

As the use of and need for digital assets increase, so too do the cost and complexity of digital asset management (DAM) — especially in a world where people are adopting devices with screens of all sizes (e.g., desktop, mobile, tablet, etc.).

DAM, however, is a challenge for many organizations. It still involves frequent manual labor, but machine learning is starting to change that.

Machine learning has already given us self-driving cars, speech recognition, effective web searches, and many other benefits over the past decade. But the technology can also play a role in classifying, categorizing, and managing assets in the years to come.

Machine learning can support DAM in areas such as face recognition, image classification, text detection, people recognition, and color analysis, among others. Google PlaNet, for example, can figure out where a photo was taken based on details embedded in it. Google Photos is using it to improve the search experience. Machine learning has already taken a role in image spam detection. Taken together, this all points to the need for DAM tools to start incorporating advanced machine-learning capabilities.

A Practical Test

Recently, Google released its Cloud Vision API. The Google Cloud Vision API enables developers to understand the content of an image by encapsulating powerful machine-learning models in an easy-to-use REST API. It quickly classifies images into thousands of categories (e.g., “sailboat”, “lion”, “Eiffel Tower”, etc.). It detects individual objects and faces within images. And it finds and reads printed words contained within images.

For Razorfish, this was a good reason to explore using the Vision API together with a DAM solution, Adobe AEM DAM. The result of the integration can be found on github.

Results screenshot

We leveraged the text-detection capabilities, auto-classification techniques, and landmark detection functionality within Google’s API to automatically tag and assign other metadata to assets.
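
For reference, a minimal sketch of what such a request might look like against the public images:annotate REST endpoint is shown below. The API key handling and the file name are assumptions made for illustration; the actual AEM integration lives in the repository linked above.

// Minimal sketch: request label, text, and landmark annotations for one image
// via the Cloud Vision REST API. Assumes key-based access via an environment
// variable; 'asset.jpg' is a placeholder file name.
const https = require('https');
const fs = require('fs');

const apiKey = process.env.GOOGLE_API_KEY; // assumption: key-based access
const body = JSON.stringify({
  requests: [{
    image: { content: fs.readFileSync('asset.jpg').toString('base64') },
    features: [
      { type: 'LABEL_DETECTION', maxResults: 10 },
      { type: 'TEXT_DETECTION' },
      { type: 'LANDMARK_DETECTION' }
    ]
  }]
});

const req = https.request({
  hostname: 'vision.googleapis.com',
  path: '/v1/images:annotate?key=' + apiKey,
  method: 'POST',
  headers: { 'Content-Type': 'application/json' }
}, (res) => {
  let data = '';
  res.on('data', (chunk) => { data += chunk; });
  res.on('end', () => {
    // labelAnnotations, textAnnotations, and landmarkAnnotations map naturally
    // onto DAM metadata fields such as tags, extracted text, and location.
    console.log(JSON.parse(data).responses[0]);
  });
});
req.write(body);
req.end();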

Benefits and Setbacks

Integrating the Vision API provided immediate benefits:

  • Automated text detection can help in extracting text from images, making them easily accessible through search.
  • Automated landmark detection helps in ensuring that the appropriate tags are set on digital assets.
  • Auto-classification can support browse scenarios for finding the right assets.

But there were also some shortcomings. For example, an image of a businesswoman in a white dress was identified as a bride. In other instances, the labels were vague or irrelevant. Though inconvenient, we expect these shortcomings to diminish over time as the API improves.

Even with these drawbacks unaddressed, automated detection is still very valuable — particularly in a DAM scenario. Assigning metadata and tags to assets is usually a challenge, and automated tagging can address that. And since tags are used primarily in the authoring environment, false classifications can be manually ignored, while appropriate classifications help surface assets much more broadly.

The Evolution of DAM Systems

One frequent challenge in implementing DAM systems is asset migration. I have seen many clients with gigabytes of assets wonder whether to go through the tremendous effort of manually assigning metadata to them.

There’s a quick fix: Auto-classification techniques using machine learning will improve and speed up this process tremendously.

With the benefits around management and migration, machine learning and other intelligence tools will therefore start becoming a key component of DAM systems — similar to how machine learning is already impacting other areas.

Lastly, incorporating machine learning capabilities in DAM solutions will also have architectural implications. Machine intelligence functionality often uses a services-based architecture (similar to the APIs provided by Google), as it requires a significant or complex set of compute resources. As DAM systems start to incorporate these capabilities at their core, it will be more difficult for those solutions to support a classic on-premises approach — causing more and more solutions to migrate to a hosted software-as-a-service (SaaS) model.

Bottom line? Consider incorporating machine learning into your DAM strategy now, and look at how it can be applied to your digital asset management process.

written by: Martin Jacobs (GVP, Technology)

Defusing Automation Arguments: The Inevitability of Automation

As mentioned in one of my previous posts, delivering a successful Cloud architecture necessitates the use of automation. Unfortunately, replacing manual tasks with code takes effort, and is therefore not always done. Here are some common arguments against the adoption of automation:

Priority

“We are already on a tight deadline with all the application features that need to be incorporated.”

Automation is critical to the success and longevity of your product. What’s also true, though, is that this is an industry of tight deadlines, stretch goals, and additional features. You might wonder if you have time to automate.

Unit testing is an interesting comparison here. It often hasn't taken priority in the application development process due to time constraints, and has been put off until the end of the development phase as a secondary concern. Over time, however, unit testing has received the priority it deserves, as it has become clear that it pays off in the long run.

And as much as testing is important, automation is even more critical. Automation is an actual part of your runtime application, and should be treated at the same level as your code. The features and capabilities for automation should therefore be included in the application/solution backlog and should be given the same treatment as other features and functionality.

Skills

“We don’t have the skills in-house. Even if we were to use a vendor, we wouldn’t be able to maintain it.”

No doubt, automation is a serious challenge. It requires a fundamental shift in mindset for organizations around the need to develop these skills. You may remember that in the early days of web development, it took quite some time for front-end development to become as respected and critical a role as, say, database administration. The automation architect will face a similarly arduous battle in the coming years. For any organization that leverages the Cloud and maintains its own technology platforms, it is a critical role that must be filled or grown within the organization.

Time

“It is faster to do it without automation.”

This is often true for the initial setup. However, considering how quickly Cloud architecture continues to evolve, the time gained from a hasty initial setup could quickly be lost in subsequent change management.

With Cloud architectures incorporating more distinct elements, ensuring consistency across environments is virtually impossible without automation. As a result, without automation, the likelihood of generating defects due to environment mismatches increases quickly when your Cloud architecture grows.

Technologies in Use

“The application technologies we use don’t support automation.”

As you architect your application, you identify critical non-functional requirements. For example, security and performance are always part of the decision criteria for the overall architecture stack, and if the selected technologies cannot support the level of performance required, you would evaluate alternatives and migrate your architecture to a new solution.

The same applies for automation. If automation cannot be supported with the existing technologies, it is necessary to look at alternatives, and evolve your architecture.

Overwhelming Choices

“We are confused by the technology landscape.”

The number of solutions in the marketplace can certainly feel paralyzing. There are configuration management tools such as Ansible, Chef, and Puppet. There are provisioning tools such as AWS CloudFormation, Heat, Terraform, and Cloudify. Solutions are constantly evolving, and new vendors are always showing up.

It is difficult to make the right choice of technologies. The selection should be made with the same mindset as selecting an enterprise set of programming languages: it requires an evaluation of which options are best suited for the organization, and a combination of these technologies might turn out to be the right answer. As you embark on applying automation, here are some tips for being successful:

  • Select a set of automation technologies and stick with it. There will always be pressure to explore alternatives, especially with a quickly changing vendor landscape, but it is important to fully understand your selected technologies before looking at alternatives.
  • Start simple. Amazon Elastic Beanstalk or Heroku are great ways to begin to incorporate automation into your application development workflow and understand how it can further drive productivity and quality.
  • Avoid the framework syndrome and focus primarily on building the automation that is needed for your application. Don’t try to build a framework for automation in the enterprise. The landscape is constantly evolving and frameworks quickly become outdated and superseded.

written by: Martin Jacobs (GVP, Technology)

The Cloud and the 100% Automation Rule

Automation and the Cloud go hand-in-hand. Without automation, the Cloud is just classic deployment with rented servers, instead of your own. You’ll need automation if you want to successfully deliver in the Cloud. This was the case early on in the Cloud era, and becomes even more important now.

As Cloud environments evolve and extend, Cloud architectures consist of far more distinct elements than a standard dedicated architecture. With the emergence of new tools like AWS Lambda, which allows you to run code without provisioning servers, these distinctions are becoming even more pronounced.

As we know, manual tasks are tricky. It can be challenging to consistently perform manual tasks correctly due to quickly changing technology and human error. For that reason, 100% automation becomes an important objective. Any deviation from full automation will create additional challenges.

For example, AWS Cloud hosting quickly becomes complex as organizations struggle to choose between many different instance types. You might not know whether you’d be better off using M3, M4 or C3.

Each decision has its own cost implications. Unless you have achieved the 100% automation target, you are often locked into an instance type due to the difficulties and risks of switching to another one, eliminating the opportunity to achieve the optimal cost/performance balance.

Our automation tools have greatly improved, but we still have work to do. Unfortunately, 100% automation is not always possible, and manual steps are sometimes still required. When they are, ensure that the surrounding process is automated as much as possible. I'll highlight this with a couple of examples.

Provisioning

Many tools automate the setup process for provisioning development, test, and production environments. From CloudFormation to Ansible, Chef, and Puppet, many steps can be automated, and as a result are traceable and reproducible. That said, it would be nice to automate updates to the provisioning stack further.

To start, the provisioning stack is often a static representation of an ideal architecture. But we live in a fast-paced world, and business moves quickly. Making automation work in dynamic environments can be tricky, particularly when infrastructure needs change, new capabilities are launched, or pricing needs to be optimized. Once your largely static architecture is in place, it is hard to keep it evolving to take advantage of new capabilities.

AWS recently launched a NAT gateway offering, eliminating the need for a NAT instance. For the majority of AWS customers, switching to a NAT gateway will improve the reliability of the overall architecture. Unfortunately, it can be difficult to ensure that this switch happens proactively.

I would recommend a scheduled review of new provider capabilities for inclusion. If something is needed, a high priority ticket is submitted to ensure that these new capabilities are incorporated with the same priority as code enhancements or defects. If necessary, the provisioning of new environments can be blocked until these tickets are addressed.

Management

Tools that automate environment management also exist. Many Cloud environments can deploy patches and upgrades automatically.

However, commercial or open source products are often deployed in these Cloud environments, and many don’t have the tools to automate the communication of new releases, patches or other updates. Checking for updates becomes a manual process.

To automate this manual process, use a tool like versionista.com to check whether a vendor page lists new hotfixes or release updates. Similar to the provisioning scenario, if a change is detected, create a ticket automatically with the right priority, ensuring its implementation.
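
For illustration, a minimal sketch of this kind of change detection follows, assuming a Node.js script run on a schedule. The vendor URL, hash file location, and ticket-creation step are placeholders; a hosted service such as versionista.com can perform the monitoring for you.

// Minimal sketch: detect changes on a vendor release-notes page and flag
// them for follow-up. The URL and the ticketing step are placeholders.
const https = require('https');
const crypto = require('crypto');
const fs = require('fs');

const pageUrl = 'https://vendor.example.com/release-notes'; // placeholder URL
const hashFile = '/var/run/release-notes.sha256';           // placeholder path

https.get(pageUrl, (res) => {
  let html = '';
  res.on('data', (chunk) => { html += chunk; });
  res.on('end', () => {
    const current = crypto.createHash('sha256').update(html).digest('hex');
    const previous = fs.existsSync(hashFile) ? fs.readFileSync(hashFile, 'utf8') : '';
    if (current !== previous) {
      fs.writeFileSync(hashFile, current);
      // Placeholder: integrate with your ticketing system here so the update
      // gets the same priority as a code enhancement or defect.
      console.log('Release notes changed - create a high priority ticket');
    }
  });
});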

Optimization

We will start to see real savings once we optimize Cloud infrastructure. However, once the architecture is in place, it is challenging to optimize it further, so this must become a core capability for any technology team.

We can optimize development and test environments, which are often neglected after a system has launched. We have eliminated manual processes there by implementing an automatic shutdown of instances after periods of low usage. The DNS entry for a stopped instance is redirected to the continuous integration environment, allowing testers or developers with the right privileges to restart the instance.
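
A minimal sketch of the shutdown piece is below, using the AWS SDK for JavaScript. The Environment tag, CPU threshold, and region are assumptions, and the DNS redirect to the continuous integration environment is not shown.

// Minimal sketch: stop dev/test instances whose average CPU has been low
// for the past day. Tag names, threshold, and region are assumptions.
const AWS = require('aws-sdk');
AWS.config.update({ region: 'us-east-1' });
const ec2 = new AWS.EC2();
const cloudwatch = new AWS.CloudWatch();

async function stopIdleInstances() {
  const result = await ec2.describeInstances({
    Filters: [
      { Name: 'tag:Environment', Values: ['dev', 'test'] }, // assumption
      { Name: 'instance-state-name', Values: ['running'] }
    ]
  }).promise();

  for (const reservation of result.Reservations) {
    for (const instance of reservation.Instances) {
      const stats = await cloudwatch.getMetricStatistics({
        Namespace: 'AWS/EC2',
        MetricName: 'CPUUtilization',
        Dimensions: [{ Name: 'InstanceId', Value: instance.InstanceId }],
        StartTime: new Date(Date.now() - 24 * 3600 * 1000),
        EndTime: new Date(),
        Period: 3600,
        Statistics: ['Average']
      }).promise();

      const averages = stats.Datapoints.map((d) => d.Average);
      const busy = averages.some((a) => a > 5); // assumption: 5% CPU threshold
      if (averages.length > 0 && !busy) {
        await ec2.stopInstances({ InstanceIds: [instance.InstanceId] }).promise();
        console.log('Stopped idle instance', instance.InstanceId);
      }
    }
  }
}

stopIdleInstances().catch(console.error);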

We can also improve upon cost management. A common approach for disaster recovery is to copy data snapshots to another region. However, as the site evolves the data size increases and the disaster recovery process becomes more expensive. How do you track when you should re-architect the process?

Cost management tools like Amazon Cost Explorer focus on products (e.g., EC2, bandwidth), not processes or features. To ensure optimal cost management, you should automatically map cost data to your processes using tags. Enforce the existence of tags through automated checking, and also automate the processing of the report. This will give the team clear indications of where to invest in optimization.
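
As an example, a minimal tag-enforcement check using the AWS SDK for JavaScript might look like the following; the required tag names are assumptions.

// Minimal sketch: report EC2 instances that are missing the tags used to map
// cost data back to processes and features.
const AWS = require('aws-sdk');
const ec2 = new AWS.EC2({ region: 'us-east-1' });

const requiredTags = ['CostCenter', 'Process', 'Feature']; // assumptions

ec2.describeInstances({}).promise().then((result) => {
  result.Reservations.forEach((reservation) => {
    reservation.Instances.forEach((instance) => {
      const tagKeys = (instance.Tags || []).map((t) => t.Key);
      const missing = requiredTags.filter((t) => !tagKeys.includes(t));
      if (missing.length > 0) {
        // Feed this into the automated report so cost data can be mapped back
        // to processes such as disaster recovery.
        console.log(instance.InstanceId, 'is missing tags:', missing.join(', '));
      }
    });
  });
});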

Challenges in Automation

Automation, like anything else, has its challenges. For a Cloud-optimized environment, it is critical to reach for 100% automation. If you cannot achieve that, automate the surrounding manual processes as fully as you can.

written by: Martin Jacobs (GVP, Technology)

NFC Technologies

With 150 million NFC-equipped phones shipped in 2012, and more than 1 billion NFC-equipped phones expected to ship by 2016, NFC technology is poised to power the next generation of interactive consumer experiences. With companies such as Google, Samsung, and (eventually) Apple backing NFC and building the chips into their handsets, giving consumers a way to use NFC directly from their phones, it is becoming obvious that this technology is growing rapidly.

NFC was also identified by the Razorfish leadership team as one of the top technologies for 2012 and beyond. NFC is about to revolutionize the access control industry by making it possible to create, use, and manage secure identities on NFC-enabled smartphones. NFC will change how people access digital content, connect online content with printed media, check in on social media, boost loyalty programs, and deliver experiences regardless of where you are in the landscape. Smartphone users will have a single device that provides physical access to their home and workplace, secure access to their PCs and corporate networks, a way to consume various experiences, and much more.

The NFC Technologies presentation can be viewed or downloaded from Slideshare.

Windows 8 Learning & App Development Considerations

Windows 8, the new OS, launched on Oct 26, 2012. As you all know, the Razorfish Seattle office has been engaged to develop Metro apps for the MSFT Retail business group since October 2010. The entire app is built on a flexible architecture that adapts to changing business content and design requirements. The app is reusable, with some level of customization, for other businesses/clients that want to provide a Windows Store app experience to their customer base.

The app itself is a challenging piece of work: it mixes technologies, supports more than 34 localizations, and is flexible enough to handle variations of content and design (to some extent) for more than 100 retailers and 5,000+ stores worldwide, running on 100,000+ PCs to start with, all of which is handled through customization without touching the code base.

Presentation: Windows8_AppDevelopment_Presentation_RF10252012V1.0

Razorfish Technology Summit 2012!

Our 2012 Technology Summit is just around the corner!

The event is by invite only, so please reach out to your Razorfish contact, or send a note to techsummit@razorfish.com for the details!

Here’s the agenda so far:

Thursday, June 14th

  • 7:30-8:45am :: Breakfast

  • 9:00-9:30am :: Welcome/Introduction - Ray Velez, Global Chief Technology Officer

  • 9:30-10:15am :: Keynote – Bob Kupbens, VP of Marketing and Digital Commerce, Delta Air Lines

  • 10:15-10:30am :: Break

  • 10:30-11:00am :: OmniChannel Commerce – Paul do Forno, SVP of Multi Channel Commerce and Kristen Flanagan, Senior Product Manager, Oracle

  • 11:00-11:30am :: The Evolution of Platforms – Drew Kurth, CEO, Fluent and Matt Comstock, VP of CIG

  • 11:30-12:00pm :: Emerging Experiences – James Ashley, Presentation Layer Architect and Jarrett Webb, Principal Developer

  • 12:00-1:00pm :: Lunch

  • 1:00-1:30pm :: Do or Die – Clark Kokich, Chairman

  • 1:30-2:00pm :: Developing for Responsive Design – Frederic Welterlin, Senior Presentation Layer Architect

  • 2:00-2:45pm :: Afternoon Keynote – John Mellor, VP Strategy and Business Development, Adobe

  • 2:45-3:00pm :: Break

  • 3:00-3:45pm :: Big Data panel – Moderated by Pradeep Ananthapadmanabhan, CTO of VivaKi’s Nerve Center

Panelists: Michael Howard (VP, Marketing, Greenplum); Dwight Merriman (CEO, 10gen); John Coppins (SVP-Product, Kognitio); Charlie Robbins (CEO, Nodejitsu); Florent de Gantes (Product Manager, Google)

  • 3:45-4:15pm :: Multichannel Architectures, a Practical Case Study - SpecialK Design Your Plan – Gustav Hoffman, Global Director, Application Solutions, Kellogg; and Martin Jacobs, VP of Technology

  • 4:15-4:45pm :: The Year Ahead in Social Technologies – Rafi Jacoby, Director, Social Technologies

  • 4:45-5:00pm :: Closing - Ray Velez

  • 6:00-8:00pm :: Cocktail Party

Friday, June 15th

(Optional workshops—please RSVP to techsummit@razorfish.com in advance.)

These workshops are designed for groups of 15-20 and will be working sessions; certain workshops require specific software and pre-reads. Please RSVP to receive more info.

  • Workshop A - Scrum for teams: A hands on cross-disciplinary deep dive for how to apply scrum on your projects. – John Ewen, VP of Delivery

  • Workshop B - Razorfish Open Digital Services and Google AppEngine for rapid app development – Stuart Thorne, Experience Director

  • Workshop C - Using Amazon Web Services for rich and automated cloud hosting – Steve Morad (Amazon), Krish Kurrupath, Group Technology Director, and Ke Xu, Senior Technical Architect

  • Workshop D - Working with Rackspace and Adobe CQ to enable and cloud host powerful CMS web experiences - Vasan Sundar, VP of Technology

Apigee and Mashery

There is some pretty cool stuff going on around APIs (application programming interfaces). It’s getting more and more important to use APIs to access social graphs and social functionality, for example through API calls to Facebook Connect (now called Facebook for Websites) and the Twitter API. But on the other side of the equation, it’s getting more and more important for your company to open up your own APIs. Best Buy’s Remix is one of my favorite examples of a company opening up its product catalog so people can build apps on top of it. Think of things like shopping engines or widgets and gadgets for the latest on-sale products, etc.

Companies like Apigee and Mashery help ensure that you are getting the best performance. Think of them like a caching delivery network for API calls. Some of the caching can be done with Akamai (e.g., JSON responses), but it’s not built for that. Apigee has an offering on top of Twitter, for example. Mashery and Apigee are great for exposing your own APIs as well. They can throttle calls to ensure that your application doesn’t fall down if you get a spike in traffic, and they can help accelerate delivery to your users through caching. These companies also provide services to manage the community of developers, doing things like providing keys for access to the engine, etc. Analytics also start to get interesting: some have called Apigee’s analytics the Google Analytics for APIs.

Why Should You Consider SharePoint for External-Facing Sites

SharePoint is a great platform for external-facing sites, either B2C or B2B. SharePoint is primarily known as an Intranet environment that allows non-technical end-users to build new sites in a very short time, with rich collaborative functionality such as document sharing, out-of-the-box integration with search, and easy to use content management.

But SharePoint also can meet the requirements of sites for consumers — sites with rich interactions, branding, personalization, and content targeting as well as integration with back-end transactional systems for ecommerce, internationalization, cross-browser compatibility, and accessibility compliance. And for B2B sites, SharePoint offers rich user management features and powerful end-user customization capability.

The SharePoint front-end Web architecture, site management, and content management features offer many ways to influence the appearance of sites, with capabilities available to site designers and developers such as site definitions, portal settings, Master Pages, page layout components, and rich Internet application integration.

As you may recall, SharePoint Server 2007 required significant custom coding to integrate all of these elements into a compelling user experience for an external audience. SharePoint 2010 has simplified that integration and made the creation of external-facing sites a simpler effort.

Good planning and design certainly pay off when using SharePoint for B2B and B2C sites, but as long as your requirements are well defined, you will find the right features to meet your requirements and design specifications. By following good design practices, you will discover that SharePoint is a productive environment with many advantages over competing platforms. For instance:

  • SharePoint is both a portal and a content management system — a rare combination that offers integration between the content management and the UI rendition part of the platform.

  • SharePoint is a site management environment too — which allows you to standardize how sites are built in a multi-site enterprise situation, which is more often the case than not.

  • SharePoint is well integrated with .NET and associated development tools. With SharePoint 2010, the integration with Visual Studio 2010 greatly facilitates the deployment of a SharePoint application.

  • SharePoint offers full search capability through the tight integration with several search solutions: Search Server, Search Server Express, and FAST, the enterprise search engine from Microsoft.

  • SharePoint offers control over search engine optimization.

  • SharePoint has a strong security model.

  • SharePoint is extensible through a rich integration layer, on the front and back-end and through a large set of third party products. There is a rich ecosystem of vendors around SharePoint that make SharePoint a nearly complete functional and operational environment.

  • Finally, SharePoint is the platform with the most publicly available documentation, code samples, and guidance; and training in SharePoint is easily accessible.

There are many convincing reasons to consider SharePoint for public facing sites. Razorfish has a wealth of experience building external SharePoint sites for brands like Kraft’s Maxwell House, Kroger, Dell Financial Services, Carnival Cruise Lines, Pfizer, Parsons, and TCDRS.

Please contact us at www.razorfish.com for more information.

How do we define cloud computing?

It comes up again: folks are asking us to define cloud computing, and every time we do, we refine it a little more. At times it has seemed like cloud computing became the new Web 2.0, a blanket term for everything. :) I actually think we define it similarly to the Wikipedia definition. For us it breaks down into two categories: cloud services and cloud infrastructure.

Cloud services are defined as technologies that provide a virtual service, either through an open API or through a user interface. Examples range from the classic Salesforce.com to cloud email like Gmail, to Twitter and the Twitter open API, and Facebook Connect. There are lots of others, and the list is growing at a frantic pace. Open APIs like Facebook Connect and the Twitter API are incredibly powerful for driving traffic and getting your product, brand, and service out there. In the past we would build a social network from scratch for a web site, which meant custom application development and maintenance; now we use JavaScript and REST to interface with Facebook Connect and we are up and running in a fraction of the time it used to take.
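
To illustrate, a minimal sketch using Facebook's JavaScript SDK is below. The app ID is a placeholder, the SDK calls shown are the current ones (which have evolved since the original Facebook Connect), and it assumes the SDK script from connect.facebook.net has already been loaded on the page.

// Minimal sketch: using Facebook's JavaScript SDK instead of building a
// social layer from scratch. 'YOUR_APP_ID' is a placeholder; the SDK script
// (https://connect.facebook.net/en_US/sdk.js) is assumed to be loaded.
window.fbAsyncInit = function () {
  FB.init({ appId: 'YOUR_APP_ID', xfbml: true, version: 'v2.8' });

  // Ask the visitor to log in, then read their public profile via the Graph API.
  FB.login(function (response) {
    if (response.authResponse) {
      FB.api('/me', function (profile) {
        console.log('Welcome, ' + profile.name);
      });
    }
  });
};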

Cloud infrastructure is defined as the virtual and physical infrastructure powering web and digital applications. Cloud infrastructure was strongly enabled by technologies like VMware, which made it possible to turn one physical server into 10 or more virtual servers. This, coupled with low-cost storage, created an elastic, scalable platform that lets us do things that weren’t feasible under the old cost models. These services are metered and you only pay as you go, which is a drastic departure from buying a server and managing it all the time whether you use it or not. What used to take weeks to get a server up and ready now takes minutes, and all you need is a credit card. Companies paving the way include Amazon, Microsoft, and Google, with traditional hosting companies like Rackspace, Savvis, Terremark, and others also making these infrastructure services available.

We believe the cloud and its ability to scale at a lower cost point will enable innovation like never before.

Detecting CSS transitions support using JavaScript

Progressive enhancement is one of the cornerstones of good web design: You build a solid foundation that supports a broad range of different browsers, and then add features either as you detect support for them or in a way that doesn’t interfere with less capable browsers.

One of the awesome new features that’s in recent versions of Safari, Safari Mobile (iPhone browser), Chrome, the Android browser, and in Palm’s webOS is CSS transitions. Transitions work by smoothly interpolating between two states of a CSS property over time. For example, using the simple style rules below, you could have a link gradually change from yellow to red when the user moves the mouse over it:

a {color: yellow; -webkit-transition: color 1s linear;}
a:hover {color: red;}

In a more complicated example, you could use CSS transitions to slide an element off-screen when a replacement element is introduced, like the way that the “pages” slide off-screen when you click through an iPhone’s contacts.

This introduces a problem: What if you’re using JavaScript to add the new element, and you want to remove the old element after it’s off screen, and you need to support multiple browsers?

You need to be able to detect that the browser supports CSS transitions so that you know not to remove the element until it’s done animating, and so that you know that it’s OK to remove the element right away for browsers that don’t support CSS transitions. Here’s how you can detect support using JavaScript:

var cssTransitionsSupported = false;
(function() {
    // Create a detached element with vendor-prefixed transition styles applied.
    var div = document.createElement('div');
    div.innerHTML = '<div style="-webkit-transition:color 1s linear;-moz-transition:color 1s linear;"></div>';
    // If the browser understands the property, it is exposed on the style object.
    cssTransitionsSupported = (div.firstChild.style.webkitTransition !== undefined) || (div.firstChild.style.MozTransition !== undefined);
    div = null; // release the reference ("delete" has no effect on variables)
})();

The variable cssTransitionsSupported will be set to true when transitions are supported, or to false when transitions are not supported.

You can then either use the webkitTransitionEnd (or mozTransitionEnd) events to detect when the transition is finished, or else immediately perform the action when transitions aren’t supported.
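
As a rough sketch (not tied to any particular framework), the cleanup step might look like this, reusing the detection flag from the snippet above. The exact event names vary by engine, so both the WebKit-prefixed and unprefixed names are registered here.

// Remove the old element once its transition finishes, or immediately when
// transitions aren't supported. Assumes oldElement already has its transition
// styles applied and cssTransitionsSupported is set as above.
function removeAfterTransition(oldElement) {
  var remove = function () {
    if (oldElement.parentNode) {
      oldElement.parentNode.removeChild(oldElement);
    }
  };
  if (cssTransitionsSupported) {
    // Event names differ by engine; registering both covers WebKit and Gecko.
    oldElement.addEventListener('webkitTransitionEnd', remove, false);
    oldElement.addEventListener('transitionend', remove, false);
  } else {
    remove();
  }
}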

Technology Predictions for 2010

Razorfish’s Matt Johnson outlined his predictions for content management over at our CMS blog, www.cmsoutlook.com. Many of his predictions will hold true for web technology at large as well. I see traction and opportunities for:

  • Cloud Options: We will see further movement towards cloud solutions, and more vendors providing SaaS alternatives to their existing technologies. It ties into the need for flexibility and agility, and the cost savings are important in the current economic climate.

  • APIs and SOA: Functionality will be shared across many web properties, and the proliferation of mini apps and widgets will continue. APIs are becoming a critical element of any successful solution. This is also driven by the increased complexity of the technology platform. Solutions we now develop frequently incorporate many different technologies and vendors, ranging from targeting and personalization to social capabilities.

  • Open Source: Not only in content management, but in many other areas, Open Source will start to play an important role. Examples are around search, like Solr, or targeting with OpenX. Cloud computing also further drives the expansion of Open Source. As companies are looking to leverage cloud solutions for agility, the licensing complications with commercial solutions will drive further open source usage.

What do you see as additional trends?

Keeping the cloud open

I really like Matt Asay’s article on why we need to focus on keeping the cloud open, and less on keeping the operating system open. If you think of the cloud as an ‘array’ of applications rather than a hosting solution, it starts to open up the aperture on its true potential. Imagine the ability to stitch together applications across the cloud the way you can stitch together data: basically a Yahoo Pipes for applications, not just data.


SharePoint Conference 2009 - Day 3

Day 3!  The whole reason I am at the SharePoint Conference this year is because I am helping our client present their SharePoint case study in one of today’s sessions.

I scheduled some lightweight sessions in the morning, starting with the fun Building SharePoint Mashups With SharePoint Designer, Bing Maps and REST Services.  This session was pretty straightforward: using a Data View web part to retrieve data from MSN and Twitter RSS and/or REST feeds, and then using XSLT to display map mashup data (Google or Bing).

Before lunch, I went to Best Practices for Implementing Multi-Lingual Solutions on SharePoint 2010 to see what new things 2010 has in store for Variations.  While there are big changes in store for multi-lingual solutions, they are more on the admin/UI side.  The biggest improvement is the performance gain from running hierarchy creation as timer jobs.  From a UI perspective, the chrome is now also localized based on the user’s preferred language, selectable from the installed language packs.  And as much as I shake my head when I hear this from people, SharePoint 2010 DOES NOT TRANSLATE YOUR SITE CONTENT AUTOMAGICALLY!

I met my client for lunch and we proceeded as a group to Breaker E - our session room.  We presented “Kraft: Migration of Consumer Facing Websites to SharePoint” to a roomful of people, and a few came up with questions, comments and leads after the session.  We consider it a success!  That was of course the highlight of my day and everything else was just blah after that point ;p  If you missed it, or are interested in watching the video of the presentation, a copy of the deck and a video of the presentation are up and available on the SharePoint Conference site.  You will need to log in with your Windows Live ID.

I spent the afternoon going to Developing Social Applications with SharePoint 2010 and Customizing the Visual Studio 2010 SharePoint Deployment Process. In 2010, comments, ratings, my network, and RSS feeds all come out of the box.  The social features available in SharePoint 2010 are OK but not good enough yet, IMO.  This is one area where I think the focus is still more on ECM implementations rather than the Internet.  The Manager/Employee metaphor just will not work in the real world.  And though I was told by the product team that it could be implemented in an Internet scenario, as shown in their AdventureWorks demo, I will have to form that opinion once I’ve seen their AdventureWorks demo site.  Deployment has indeed been made simpler in VS2010 by being able to compile and deploy from VS2010 to a local SharePoint instance.  But for deploying between environments, and between farms, WSPs are still the best way to go.

This evening’s event is Ask The Expert and SharePoint Idol, a Rock Band competition.  I thought for a sec about joining a team but changed my mind.  I had fun watching them though.

SharePoint Conference 2009 - Day 2

The challenge I always have with these conferences is the plethora of choices available to attendees.  I already know what topics I want to focus on:  WCM; Architecting, Developing and Building public facing internet sites, and Social features in 2010.  But even so, there are still time slots where I have narrowed down the choice to 3, and then I have to make the tough decision and hope that I made the right choice.  For the most part, I decided to always go to a 300 or 400 level session, and then just watch the video and the deck online for the 200 sessions I missed.

For the 9am slot, I had to choose between Advanced Web Part Development in VS 2010 and Introduction to Service Applications and Topology.  The architect won over the developer, so I went to the Service Applications session. Essentially, in 2010 the SSP (Shared Service Provider) is replaced by the new Service Applications architecture. You build service applications that can live on a separate application server, and you call them from clients (in this case a SharePoint web front end) via proxies.  I’m not sure if this is a correct simile, but I kind of liken it to the old DCOM architecture. This makes it easier for organizations (and frankly, ISVs) to build service applications that can be deployed once and then used in multiple SharePoint web apps, and even multiple SharePoint farms.

There’s a follow-up session to this about Scaling SharePoint 2010 Topologies for Your Organization, but I skipped that in favor of Overview of SharePoint 2010 Online. SharePoint Online is another product in Microsoft’s “Software as a Service” offerings.  It is essentially a service where Microsoft hosts and manages SharePoint for your organization.  This is part of Microsoft’s Business Productivity Online Suite (BPOS), which also includes Exchange Online, Office Live Meeting, Office Communications Online, and Dynamics CRM Online. It is a good fit for small or medium-sized businesses, but can also be considered for the enterprise in some special cases.  The important thing to note is that this does not have to be an all-or-nothing decision.  SharePoint Online is supposed to complement/extend your on-premises infrastructure, not necessarily replace it.

In the afternoon, I agonized over Developing SharePoint 2010 Applications with the Client Object Model and Microsoft Virtualization Best Practices for SharePoint, but ended up going to Claims Based Identity in SharePoint 2010.  The client object model was getting a lot of good tweets during and after the session, and I see a lot of opportunities there for us to pull SharePoint information via client calls, i.e., JavaScript or Silverlight.  The virtualization session focused on Hyper-V, so I didn’t feel too bad about missing it. In the Claims Based Identity session, Microsoft introduced their new Identity Framework and explained how it works.  It works much like Kerberos, where SAML tokens are created.  The good news is that it supports AD, LDAP, and SAML.  The bad news is that it doesn’t support OpenID and other standard internet auth schemes/standards… yet.

I wanted to know more about composites and the new Business Connectivity Services (BCS), so I went to Integrating Customer Data with SharePoint Composites, Business Connectivity Services (BCS) and Silverlight.  BCS is another new thing in 2010 that is interesting.  Allowing SharePoint to create an External Content Type that can pull data from external LOB systems opens up a lot of possibilities, but most of the demos I’ve seen so far only connect to one table.  In the real world, we would be connecting to more complex tables, in a lot of cases pulling hierarchical data, and I wanted to see how this works and, more importantly, whether it will support CRUDQ features.  This session finally demoed how to connect using a LINQ data source.  I didn’t see the CRUDQ part, though, because the demo was read-only data.

For the last session of the day, I chose between Securing SharePoint 2010 for Internet Deployments (400) and SharePoint 2010 Development Best Practices (300).  Of course, I chose the geekier session, since security is a hot topic on public-facing sites.  However, this was probably one of the more disappointing sessions for me, as it was really targeted more towards SP IT pros than developers.  It was more about hardening your servers and protecting your network, and these considerations already come by default in Windows Server 2008.  I probably would have enjoyed the best practices session more, even though I was afraid it would be filled with “duh” moments.  I have to check that deck out though; it produced some funny tweets.

Day 2 is also the night of the Conference Party.  This year, the theme is 80’s night at The Beach (Mandalay Bay) with Huey Lewis and the News providing music and entertainment.  Too bad I missed it.

CloudFront, Amazon's Content Delivery Network (CDN)

Image: Speed differences between Amazon S3 and CloudFront (by playerx via Flickr)

It’s nice to see Amazon moving into the CDN space with their CloudFront offering; the CDN market can definitely use a fresh look at the challenge. It looks like it builds off your usage of Amazon S3, with an accelerator finding the closest cache server to deliver your content. With this approach it doesn’t seem like a great fit as a CDN for every architecture. The chart on the right is an interesting comparison.

I’ve been intrigued over the last couple of years with Coral Caching. Peer-to-peer open source caching seems ripe with opportunity: wouldn’t it be cool if my media center PC, Apple TV, and other laptops that sit at home idle during the day could be leveraged to help offload servers? I guess it’s a balance of saving power by sleeping or turning off the box vs. using less server power.

Image: Diagram of a peer-to-peer network (via Wikipedia)


Google creates another new language - Noop


Google announced a new language called Noop today.  It looks pretty interesting, building on the power of Spring and, most notably for them, building dependency injection right into the language. As we look across the projects we are working on, it’s clear how important Spring has become to our architectures, so it’s nice to see a more formal recognition. It’s always neat to see the impact of ‘side projects’ at Google. Here are some of the highlights:

  • Dependency injection built into the language

  • Testability - a seam between every pair of classes

  • Immutability

  • Syntax geared entirely towards readable code

  • Executable documentation that’s never out-of-date

  • Properties, strong typing, and sensible modern standard library


Microsoft talking about a private cloud?


Just a couple of weeks after Amazon’s announcement of their private cloud offering, it looks like Microsoft is starting to open discussions in that direction. What’s interesting about Microsoft’s discussion is that they are coming at it from two directions: they are a provider to the data centers, hosting providers, and enterprises building these offerings, as well as a provider directly to the consumer.


.gov is saving money and time with cloud computing

Cnet reports today on how Vivek Kundra, the US Chief Information Officer (CIO), is pushing for more movement into the cloud computing space to help save taxpayer dollars. There are definitely huge savings with cloud computing, and it’s getting harder and harder for enterprises to ignore. Especially with the recent announcement around Amazon’s Private Cloud, it seems like the enterprise barriers to adoption are slowly eroding away.

I did find Vivek’s assertion here hard to believe:

“Using a traditional approach to add scalability and flexibility, he said, it would have taken six months and cost the government $2.5 million a year. But by turning to a cloud computing approach, the upgrade took just a day and cost only $800,000 a year.”

but not knowing all the details, it might be real. Six months down to one day sounds too much like pixie dust to me!


Amazon Advances Cloud Computing with the Private Cloud


Amazon advances cloud computing with the introduction of a private cloud. The economics really are powerful enough to force businesses to take note. Anecdotally, I’ve spoken to several highly functional startups using the cloud successfully for their web applications. With the advent of more secure private clouds, I don’t see how enterprises can stay away much longer.


Flex 3 vs Silverlight 3: Enterprise Development

With the recent release of Silverlight 3 and Flash 10 / Flex 4, the Flash vs Silverlight debate has been stoked yet again.  The debate has been raging on Twitter using the tags #flex and #silverlight.  Links to articles are posted almost every day and retweeted endlessly.

The latest two most talked about articles fall squarely on each side.   On the Silverlight front, a blog post from a coder with a lot of experience in .NET and some Flex experience shares his insights on enterprise development.  His experience with C# colors his opinion of Flex development however, and his inexperience with Flex is evident through his omission of development tools such as FDT and Flex frameworks such as Mate.  His biggest arguments revolve around language features (C# is a more robust, full featured language), AS3’s lack of a native decimal type, and the Flex IDE.  In regards to Silverlight’s penetration, the author claims that since Enterprise apps work within a company, it’s easier to get Silverlight installed.   He finishes the blog post with a nod towards Flex: “this particular Flex application is the best looking application I’ve ever seen!”

On the Flex side, Tim Anderson writes about the release of Morgan Stanley’s Matrix application in his blog.  In it he highlights a few points made during the presentation on why Flash / Flex was chosen over Silverlight.  The main points of the article highlight Flash Player 9’s penetration and Silverlight’s lack-there-of, the application’s speed, and allowing the designers on the project to use products they wanted during development.   The most insightful quote of the presentation was regarding the design tools:

“You have to look at the people that use that technology. The design community. That’s the biggest problem that Microsoft has. The designers all carry around Apple laptops, they all use the Photosuite [sic] set of software tools. It’s like asking structural engineers to stop using CAD applications. That’s the tool that they use, and if you can’t convince them to switch away from your software suite you are going to get a limited number of designers that will use Microsoft’s toolset … if you can’t get the designers to switch, to learn a new language, then how can you possibly ever get some traction?”

So there you have it, one article by a seasoned .NET developer decrying Flex’s lack of language features and another decrying Silverlight’s inability to win over designers.

I have also straddled the fence between .NET and Flex development for a number of years and have worked a little bit with Silverlight, so I tend to agree with both articles.  They are both right.  AS3 is an inferior language and its default IDE is definitely no match for VS.NET; however, Flex/AS3’s speed isn’t as bad as it’s made out to be, and it’s a ubiquitous platform with a VERY low barrier to entry for designers and other non-developers.

So this debate boils down to form vs function.  It’s harder to write a large application with a lot of business logic in Flex, but it’s easier to make it look good.  The opposite is true for Silverlight.  So just like with any other technology, you have to make a choice based on your audience, the design, and the lifetime of the application.

So I’d recommend Silverlight if:

  • your audience is small and you have control over the environment they are going to use the app in

  • the design isn’t complex (like heavy use of blend modes, interactive 3d elements)

  • you need tight integration with a .NET backend

  • there is a lack of Presentation Layer Developers

I’d recommend Flash if:

  • you are serving a large, diverse audience

  • you have a complex design with animation (3D, webcam integration, etc)

  • your application uses mainly webservices to communicate to a backend

  • you have sufficient presentation layer development resources

You can duplicate most sites built in Flex in Silverlight and vice versa (with a few exceptions).  It’s just a matter of the right tool for the right job.  I lean towards Flex because I feel it has the most flexibility (no pun intended), but I do like XAML / WPF / Silverlight and am excited to see it evolve and be a competitor to Adobe.

Everyone wins when there is competition.

Taming IE6 and a "Drop IE6" rebuke

During the development of any project that involves HTML, there’s always a nagging question in the back of your mind:  “How broken will this site be in IE6?“  Here’s an article that will reduce the amount of worrying you do when fixing your site to work in IE6.  It covers the majority of issues you’ll encounter when working with IE6.

Definitive Guide to Taming the IE6 Beast

The article covers:

  • conditional comments

  • target IE6 CSS Hacks

  • Transparent PNG-fix

  • double margin on float

  • clearing floats

  • fixing the box model

  • min/max-width/height

  • overflow issues

  • magic-list-items appearing

It’s probably the last article on IE6-specific CSS techniques you’ll ever need to read.  Required reading for all PLDs.

On the topic of IE6 and whether or not we should still be supporting it, here are some thoughts.

IE6 support seems to be waning, but we still have plenty of clients running IE6 exclusively on their work machines, so until they upgrade to Windows Vista / 7 we’ll continue to have to support it.

In the past year there have been a few campaigns to get people to upgrade like hey-IT.com, www.bringdownie6.com, and www.end6.org.   Also, Google just announced that YouTube wouldn’t support IE6 anymore in the near future.

Sadly, the more I thought about just saying “no more IE6 support”, the more I realized that the people running IE6 at this point couldn’t upgrade.  They are usually either on older machines (Windows 2000 or earlier) or their IT won’t upgrade because a legacy web-based application depends on IE6, like a CRM or ERP app.  These applications aren’t upgraded often, and they are definitely not upgraded during a recession.

Full IE6 support is vital for any site that caters to business users (IT issues / older computers), international users (older computers), or a large percentage of the public (lots of people don’t upgrade their computers/OS when all they do is browse the web with them).

Here’s a good chart that shows the trends for various browsers / versions from Oct-04 to May-09 based on data from NetApplications.com

It shows IE6 usage just below Firefox usage in May-09.

As much as I dislike “fixing” the sites I work on to work with IE6, I think we’re going to have to do it at an agency level for another year or so.

Collaboration and Enterprise 2.0

AIIM (www.aiim.org), a non-profit organization for enterprise content management (ECM), has released a report on how “Collaboration and Enterprise 2.0” is gaining importance among businesses.

According to this AIIM report, there has been a dramatic increase in the understanding of how Web 2.0 technologies such as wikis, blogs, forums, and social networks can be used to improve business collaboration and knowledge sharing, with over half of organizations now considering Enterprise 2.0 to be “important” or “very important” to their business goals and success. Business take-up of Enterprise 2.0 has doubled in the last year. Here are some key findings:

  • Knowledge-sharing, collaboration and responsiveness are considered the biggest drivers.

  • Lack of understanding, corporate culture and cost are the biggest impediments.

  • 71% agree that it’s easier to locate “knowledge” on the Web than it is to find it on internal systems.

  • 40% feel it is important to have Enterprise 2.0 facilities within their ECM suite, with SharePoint Team Sites as the most likely collaboration platform.

  • Only 29% of organizations are extending their collaboration tools and project sites beyond the firewall.

  • As regards governance of usage and content, only 30% of companies have policies on blogs, forums and social networks, compared to 88% who have policies for email.

  • Whereas almost all companies would not dream of sending out un-approved press releases or web pages, less than 1 in 5 have any sign-off procedures for blogs, forums and even the company’s Wikipedia entry.

  • Planned spending on Enterprise 2.0 projects in the next 12 months is up in all product areas.

About 47% of companies opted for SharePoint as their collaboration platform.

SharePoint as a Collaboration platform

SharePoint is leading the Enterprise 2.0 revolution by providing a comprehensive business productivity platform that combines traditional collaboration solutions with newer social-computing technologies in an enterprise-capable product. Using rich blog, wiki, RSS, mashup and social-networking solutions combined with the enterprise content management and search capabilities of SharePoint, SharePoint customers are well positioned to deliver real Enterprise 2.0 solutions today.

Companies can use social tool plug-ins like Socialtext, Atlassian Confluence, and Connectbeam (among many others) to add more advanced social features to SharePoint.

More information about SharePoint Server social-computing is available on Microsoft’s website

You can read more SharePoint articles and how-to’s on my SharePoint blog.

SXSW to Go: Creating Razorfish’s iPhone Guide to Austin (Part 3)

Optimization

As the Razorfish Guide to SXSW became more fully developed, we started to look at key areas where we could make performance gains and either actually speed up the site or simply make the site appear to load more quickly. (Check out part 1 of our story to see how requirements for the site were gathered and part 2 to learn about how the site was architected)

Cache it good

One of the earliest steps we took to optimize the application was to use server-side caching. ASP.NET allows you to cache just about anything on the server for quick retrieval. Taking advantage of this feature means that you can avoid extra trips to the database, requests to other services, and repeating other slow or resource-intensive operations. The Razorfish.Web library’s abstraction makes ASP.NET’s caching easy to use, and we quickly added it both to all database calls and to store most MVC models.

Zip it up

A second key optimization was to add GZIP compression to our assets. GZIP compression shrinks the size of most text-based files (like HTML or JSON) down to almost nothing, and makes a huge difference in the amount of time it takes for a slow mobile client to download a response. IIS7 has this feature built in, but we were running the site off of an IIS6 server. Happily, Razorfish.Web.Mvc has an action filter included that supports compressing your responses with GZIP.
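
The site used an ASP.NET action filter for this, but the effect of GZIP on a text-based response is easy to see with a few lines of Node.js; the sample payload below is made up purely for illustration.

// Illustrative only: compress a repetitive JSON payload with Node's zlib
// module and compare sizes. Real savings depend on the content.
const zlib = require('zlib');

const json = JSON.stringify({
  events: new Array(500).fill({
    name: 'SXSW party',
    venue: 'Downtown Austin',
    description: 'Free drinks and live music.'
  })
});

const gzipped = zlib.gzipSync(json);
console.log('uncompressed:', json.length, 'bytes');
console.log('gzipped:     ', gzipped.length, 'bytes');
// Repetitive text-based content typically shrinks dramatically, which matters
// most on slow mobile connections.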

Strip out that whitespace

Next, we used Razorfish.Web’s dynamic JavaScript and CSS compression to strip out unnecessary characters and to compact things like variable names. Minifying your scripts and stylesheets reduces their file size dramatically. One of the nice features of Razorfish.Web is that it also can combine multiple files together, reducing the overall number of requests that a client has to make. All of this happens dynamically, so you’re free to work on your files in uncompressed form, and you don’t have to worry about going out of your way to compact and combine files.

Sprites

Another key optimization was combining all of the image assets into a single file, and using CSS background positioning to choose which image to display. Doing this not only cuts the number of requests that have to be made (from 10 to 1, in our case), but also cuts the overall amount of data that needs to be loaded. Each file has its own overhead, and you can cut that overhead by combining them.

Keep it in-line

As we started testing on the actual iPhone, we still weren’t satisfied with the page’s load time. There was a significant delay between the page loading and the scripts loading over the slow EDGE network. This defeated the purpose of the JSON navigation because the user was apt to click a link before the scripts had a chance to load and execute – meaning that they’d have to load a new HTML page. If the scripts were delivered in-line with the page, there would be no additional request, and they could execute right away. Because the successive content was to be loaded with JSON, concerns about caching the scripts and styles separately from the page were moot. We set about extending Razorfish.Web so that it could now insert the combined and compressed contents of script and style files directly into the page. By moving the scripts and styles in-line, we shaved off about 50% of our load time, and the scripts were now executing quickly enough that the JSON navigation mattered again.

Smoke and mirrors

A final touch was to take advantage of Safari Mobile’s CSS animation capabilities. The iPhone supports hardware-accelerated CSS transitions and animations, meaning fast and reliable animation for your pages. We added a yellow-glow effect to buttons when pressed. The glow was not only visually appealing, but its gradual appearance also helped to distract the user for the duration of the load time of the successive content.

Success

The team managed to pull the web application together in time for launch, and the guide was a smashing success. Over the course of SXSW, sxsw.razorfish.com was visited by 2,806 people who spent an average of 10 minutes each on the site, typically viewed about 8 pages, and often came back for second and third visits. The site attracted a large amount of buzz on Twitter and was praised as the go-to guide for the conference.

When designing for mobile, speed is key. All of the components of the site, including the design, need to work together to connect the user to the content as quickly and as efficiently as possible. In such a hyper-focused environment, the user experience, graphic design, and technology need to be unified in supporting a shared goal.

By producing a responsive, reliable, easy-to-use, to-the-point, and locally-flavored guide to the city, the team succeeded in creating a memorable and positive impression of Razorfish at SXSW.

SXSW to Go: Creating Razorfish's iPhone Guide to Austin (Part 2)

Design and Development

Up against a tight deadline, our small team was working fast and furious to create the Razorfish mobile guide to Austin in time for the SXSW Interactive conference. With our technologies determined and all eyes on the iPhone, we set out to bring the guide to life. (Check out part 1 of our story to find out more about how we set requirements and chose technologies)

The meat and potatoes

The guide is content-driven, and we knew that the site wouldn’t be given a second look without strong content to back it up. Our team decided on structuring the site as nested categories, with a design reminiscent of the iPhone’s Contacts application, and breadcrumb navigation (as found in the iTunes Store).

With the flow determined, the creative director started developing the content categories and soliciting suggestions from the office about their favorite Austin haunts. She enlisted an information architect to assist with writing the site’s content, and they churned out the site’s content over the next several weeks.

Simultaneously, one of our presentation layer developers began work on graphic design, another focused on hosting and infrastructure, and I began working on database and application architecture.

Getting around

The first major issue we tackled when working on the front-end of the site was navigation. We had identified several features that were essential for the guide to perform satisfactorily:

  • Rather than loading a new page, new “pages” of data should be loaded as JSON and have their HTML constructed on the client side. JSON is a very compact way of moving data and is easy to support using JavaScript’s eval function. By using JSON to communicate between the server and the client, we avoided the performance hits of loading a larger response, rendering a fresh page, running scripts again, and checking cached components against the server. Those costs are often negligible on a PC with a fast internet connection and plenty of memory, but on a mobile device, every byte and every request makes a noticeable impact.

  • Data should be cached on the client whenever possible, and repeat requests to the server for the same data should be avoided.

  • The browser’s history buttons (Back and Forward) must work, and ideally work without making new requests to the server.

  • The site must be navigable in browsers that cannot properly support AJAX.

To satisfy both the first and last requirements, we effectively needed two versions of every page running in parallel (a JSON version for AJAX-ready clients and an HTML version for others). Luckily, the MVC framework makes this easy on the server. By properly defining our data model classes, we could either send the model object to a view page, where each of the data points would be plugged in and rendered as HTML, or we could serialize the model directly to JSON and send it to the client. To make it easy for the client script to select the right version, all of the JSON page URLs were made identical to the HTML URLs, except with “/Ajax” prepended. With this URL scheme in place, JavaScript could simply intercept all hyperlinks on a page, add “/Ajax” to the location, and load a JSON version of the content instead of a whole new page.

To determine when to use JSON and when to use HTML, we did some simple capability testing. If window.XMLHttpRequest, the W3C standard AJAX object, existed, it was safe to use JSON navigation on the client. Incidentally, Internet Explorer 6 and many mobile browsers do not expose this object, which greatly simplified later development.
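
Here is a minimal sketch of that capability check and URL scheme. The helper names (toJsonUrl, loadJson, renderPage) are illustrative assumptions, not the names of our production classes.

```javascript
// Use JSON navigation only when the W3C XMLHttpRequest object exists;
// otherwise links behave normally and the plain HTML pages are served.
function supportsAjaxNavigation() {
  return typeof window.XMLHttpRequest !== 'undefined';
}

// "http://example.com/venues/bars" -> "/Ajax/venues/bars"
function toJsonUrl(href) {
  return '/Ajax' + href.replace(/^https?:\/\/[^\/]+/, '');
}

function renderPage(data) {
  // Hypothetical stand-in for the page classes described below.
  document.getElementById('content').innerHTML = data.html || '';
}

function loadJson(url) {
  var xhr = new XMLHttpRequest();
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      // The post mentions eval(); JSON.parse is the safer equivalent where available.
      renderPage(eval('(' + xhr.responseText + ')'));
    }
  };
  xhr.open('GET', url, true);
  xhr.send(null);
}

// Intercept every hyperlink and swap the page load for a JSON request.
window.onload = function () {
  if (!supportsAjaxNavigation()) return;
  var links = document.getElementsByTagName('a');
  for (var i = 0; i < links.length; i++) {
    links[i].onclick = function () {
      loadJson(toJsonUrl(this.href));
      return false; // cancel the normal navigation
    };
  }
};
```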

Several JavaScript classes were created to support page rendering: A history class to manage caching and the forward/back buttons, a base page class that would take care of rendering JSON into HTML, and an application class that would manage the interactions between the pages, the history, and the user. A handful of page types were identified, and subclasses were created from the base page for each specialized layout and different data model.

A method called BrowseTo was defined on the application class to handle all actions associated with the user clicking a link or going to a new URL. BrowseTo did several things (a condensed sketch follows the list):

  1. Identifying the JSON URL (dropping the “http” and the domain, and adding “/Ajax”)

  2. Determining what page class to use to render the JSON data

  3. Checking if there’s already cached data for the URL, and making a request to get the data if there’s not

  4. Instructing the page to render

  5. Instructing the history to add the new page to the list of visited sites

  6. Caching the JSON data from the response in memory if a new request was made
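
A condensed sketch of that flow is below. Only the name BrowseTo comes from our code; the cache, history list, and page class shown here are simplified stand-ins, and toJsonUrl() is repeated from the earlier sketch.

```javascript
function toJsonUrl(href) { // repeated from the earlier sketch
  return '/Ajax' + href.replace(/^https?:\/\/[^\/]+/, '');
}

var jsonCache = {};    // "dirty cache": JSON responses keyed by URL
var pageHistory = [];  // visited pages

// Hypothetical base page class; the real site had a subclass per layout.
function ListPage(data) { this.data = data; }
ListPage.prototype.render = function () {
  document.getElementById('content').innerHTML = this.data.html || '';
};

function pageClassFor(url) {
  return ListPage;                    // 2. determine which class renders this URL
}

function BrowseTo(href) {
  var url = toJsonUrl(href);          // 1. identify the JSON URL
  var PageClass = pageClassFor(url);  // 2. pick the page class

  function show(data) {
    new PageClass(data).render();     // 4. instruct the page to render
    pageHistory.push(url);            // 5. record the visit in the history
  }

  if (jsonCache[url]) {               // 3. already cached? no request needed
    show(jsonCache[url]);
    return;
  }

  var xhr = new XMLHttpRequest();     // 3. otherwise request the data
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      var data = eval('(' + xhr.responseText + ')');
      jsonCache[url] = data;          // 6. cache the new response in memory
      show(data);
    }
  };
  xhr.open('GET', url, true);
  xhr.send(null);
}
```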

Due to time constraints, we opted to use “dirty-caching” for JSON data. When dirty-caching, you’re storing the JSON object in memory under a key. In this case, the key was the URL. There are a few downsides to this method:

  • Storage isn’t persistent, and only lasts as long as the browser is open on that page

  • You’re using up memory, not disk space, to store data, which could eventually overwhelm the client and cause it to crash

Because the size of the data that we were caching was very small, and dirty-caching is both very fast to implement and universally supported, we used it to temporarily store data. Given more time, we would have taken advantage of the iPhone’s HTML 5 local storage features. On any browser that supports this feature, you can store data in a database on the client. Many web applications take advantage of this feature to provide persistent offline access to content. The downside is that the HTML 5 local storage API is somewhat tricky to implement properly and is currently confined to a select few browsers.
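
To make the trade-off concrete, here is a sketch of the dirty cache next to what a persistent version might have looked like. We only shipped the in-memory object; the localStorage branch is an assumption about how the HTML5 storage APIs could be used (and it assumes a native JSON object or a json2.js shim).

```javascript
var memoryCache = {}; // the "dirty cache": JSON keyed by URL, gone on reload

function cacheGet(url) {
  if (memoryCache[url]) return memoryCache[url];
  // Hypothetical persistent fallback we did not build: HTML5 localStorage.
  if (window.localStorage) {
    var stored = localStorage.getItem('json:' + url);
    if (stored) return (memoryCache[url] = JSON.parse(stored));
  }
  return null;
}

function cachePut(url, data) {
  memoryCache[url] = data; // fast, universal, but memory-only
  if (window.localStorage) {
    try {
      localStorage.setItem('json:' + url, JSON.stringify(data)); // survives reloads
    } catch (e) {
      // Quota exceeded or storage unavailable: quietly fall back to memory only.
    }
  }
}
```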

A little bit of history

Forward and back button support comes naturally when you’re loading new pages, but for the JSON version of the site, we implemented a solution based on URL hashes (the # data at the end of a URL). Most browsers will include URL hashes as a state that can be navigated to using the forward and back buttons. By regularly scanning the URL hash, you can update your page when there’s a change and simulate forward/back button support. Our history class was designed to add the “/Ajax” path as the URL hash, making it easy to determine what JSON data to load when the hash changed.
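
A sketch of that polling approach is below; BrowseTo is the routine from the earlier sketch, and the other names are illustrative. (2009-era Safari Mobile had no hashchange event, hence the timer.)

```javascript
var lastHash = '';

// Called by the history class after a page renders: store the "/Ajax" path in
// the hash so the browser records it as a history entry.
function rememberPage(jsonUrl) {
  lastHash = '#' + jsonUrl;
  window.location.hash = jsonUrl;
}

// Poll the hash; when back/forward changes it, re-load that page's JSON.
function pollHash() {
  var hash = window.location.hash;
  if (hash === lastHash) return;
  lastHash = hash;
  if (hash.indexOf('#/Ajax') === 0) {
    // Hand the plain page path back to BrowseTo, which re-derives the JSON URL
    // and renders from the in-memory cache when it can.
    BrowseTo(hash.substring(1).replace(/^\/Ajax/, ''));
  }
}

setInterval(pollHash, 250);
```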

With our navigation system intact, and our creative team churning out new content for the site, we took a step back and started to look at performance. Check back next week to see how we fine-tuned the site to work quickly and responsively on the iPhone.

SXSW to Go: Creating Razorfish’s iPhone Guide to Austin (Part 1)

Once a year, the internet comes to visit Austin, Texas at the South by Southwest Interactive (SXSWi) conference, and, for 2009, the Razorfish Austin office was determined to leave an impression. We ended up making close to 3,000 impressions.

Industry leaders and the web avant-garde converge on Austin for one weekend each year to learn, network, and see the cutting edge of interactive experience and technology. And also to take advantage of any number of open bars. It is a conference, after all.

The Razorfish Austin office typically plays host to a networking event and takes out ad space in the conference guidebook. In 2009, confronted with shrinking budgets in the wake of the global financial crisis, we knew we had to set ourselves apart and do it on the cheap.

iPhone Apps were on everyone’s mind (and would be in every conference attendee’s pocket), and would prove to be the perfect venue to showcase Razorfish’s skill and Austin’s personality. In late January 2009, three presentation layer developers and a creative director formed a small team and set out to build an iPhone-ready guide to Austin.

Over this series of articles, I’ll be diving into how we created the Razorfish Guide to SXSW iPhone-optimized web site. Part 1 will deal with requirements gathering and technology choices, part 2 will cover design and development, and part 3 will talk about what we did to optimize the mobile experience.

Requirements

The first thing we did as a team was to sit down and discuss what the guide had to be. Going in, we knew we wanted it to be on the iPhone because of the cachet associated with the device. We also knew that we had a very condensed timeline to work in – we needed to launch in 5-6 weeks, and we all had other projects that required our focus.

To App, or not to App?

One of the first decisions we made was to approach the guide as an iPhone Web App rather than building an Objective-C compiled application. We knew that we didn’t have a technical resource available who already knew Objective-C, and that we would have trouble getting approved and into the App Store in time for our launch. Most importantly, we needed as many people as possible to be able to use the guide, and didn’t have time to create different versions for different devices.

iPhone Web Applications offer not only a way to leverage the iPhone’s impressive graphical capabilities, thanks to Safari Mobile’s excellent support for web standards and emerging CSS features, but also a way to reach other platforms using progressive enhancement (testing for a feature, and then enhancing the experience for clients that support that feature).

Mobile madness

There are dozens, if not hundreds, of mobile browsers out there, with wildly differing interpretations of CSS and JavaScript. Check out Peter-Paul Koch’s CSS and JavaScript mobile compatibility tables if you need convincing. Supporting multiple mobile devices is no cakewalk, especially since many of them have incorrect or misleading user agents.

The iPhone was our target, and some mobile browsers, such as many versions of Opera Mobile, also have relatively good standards support, but what about IE Mobile or Blackberry?

We quickly came to the conclusion that, because of the condensed timeline, we should test in and support Safari Mobile only, but that the site also needed to be fully usable with no CSS or JavaScript whatsoever. By ensuring this baseline level of functionality, we could be certain that even the Blackberry browser could at least limp across the finish line.

Back to the desktop

Along with choosing mobile browsing platforms to support, we also had to decide for which desktop browsers to design the site. Ordinarily, desktop compatibility testing is dominated by Internet Explorer 6, but this site was geared towards web designers and developers.

That meant more people would be visiting the site using Chrome than using IE6.

IE6 was swiftly kicked to the curb, and we settled on fully supporting Firefox 3, Safari 3 and Chrome, with basic support for Internet Explorer 7. Safari and Chrome support came almost for free, because the two render almost identically to iPhone’s Safari Mobile.

Site be nimble, site be quick

Supporting mobile devices means supporting weak signals, slow connections, small screens, bite-sized memory, and users who are on the go. There are a number of factors conspiring against any mobile website, and we knew that we would have to eke out every last bit of performance in order to overcome them.

Limit the chatter

Client interaction with the server not only increases design complexity, but it also increases the size and number of requests. There were several key factors that made us decide to keep forms and complex interactivity out of the site:

  • Applications that use forms have to validate the data, and guard against attacks. This can slow down the experience, and also would require a more in-depth security review.

  • POST requests are slow. Data-heavy responses are slow. Increasing the number of requests involved in typical usage puts a heavier burden on the server and delays the user in getting from point A to point B.

  • Sites that can be customized or that allow the user to log in typically can’t cache data as efficiently, because page data is often sensitive to the user.

To make the site run quickly, launch on time, and be successful in its goals, the application would be focused on being the best guide it could be, and not on integrating your Twitter account and kitchen sink.

Sell the brand

Lastly, the guide had to make Razorfish look good and leave a strong impression of who we are and what we’re all about. If the guide was as informative, fast, and easy to use as could be, but didn’t sell our brand, it would be a failure.

Technologies

Based on the requirements we gathered, the team picked familiar development libraries and languages to work with.

XHTML, CSS and JavaScript

These languages should come as no surprise, as they’re integral to all web applications. An important decision that we did make, however, was that no JavaScript or CSS frameworks should be used.

For desktop development, our industry has become increasingly reliant on JavaScript frameworks to smooth out cross-browser wrinkles and speed up our work. Generally, JavaScript frameworks excel at meeting both of those goals.

There are a few problems to weigh when considering a JavaScript framework for mobile development:

  • Frameworks add a lot of bulk to the page. 54 KB for jQuery 1.3 isn’t much on the desktop, where fast internet connections are common, but it’s painful over 2G wireless connections used by many mobile phones (the first iPhone model included).

  • When you’re targeting a single platform (or a standards-compliant platform), a lot of the framework’s code is going to go to waste. Much of the code in JavaScript libraries is for abstracting cross-browser compatibility issues.

  • When you’re targeting multiple mobile platforms, most frameworks aren’t built with mobile in mind, and may be unable to perform properly regardless.

  • The iPhone doesn’t cache components that are over 25 KB in size. (Unfortunately, the limit applies to the decompressed size, so it doesn’t matter if the component is under 25 KB when GZIP compression is used.)

  • The framework’s code has to be executed on the client in order to initialize all of the framework’s components. On slower clients, such as mobile devices, this is a longer delay than you might think, and many of those features probably won’t be used on the site.

In the future, JavaScript frameworks may overcome these challenges, but we resigned ourselves to starting from scratch for this project.

CSS frameworks were out of the question for many of the same reasons.

ASP.NET MVC

The ASP.NET MVC Framework was chosen as our server-side technology primarily because of the team’s familiarity with it. Having just recently used the technology on other projects, it was still fresh in our minds. The MVC framework allows for quick, clean and very functional design that you have a great deal of control over.

Razorfish.Web

We elected to use our internally-developed .NET library that’s specialized for use on web projects. Razorfish.Web has a number of features that made it indispensable for this project, such as dynamic CSS and JavaScript compression. As I’ll cover later, we extended the library while building the guide to push optimization even further.

SQL Server

Microsoft’s database engine was the natural choice to go along with ASP.NET MVC. We used LINQ to SQL to easily communicate with the database from the web server.

With our tools selected, we were ready to start building the site. Come back for part 2 to learn about some key design and development decisions that went into making sxsw.razorfish.com.

agile and pair programming

One of my favorite topics in agile and iterative development is pair programming. The question is: can we make it happen more, and do we want to try it more? I’ve typically seen it on smaller and more isolated projects. It’s a fascinating concept, and the research I have found, while minimal, tends to say that two developers get more high-quality work done than one working independently.

I also found it interesting that it’s a core tenet of education in some circles today. When my wife was getting her master’s in education, pair learning was one of the approaches she was taught. Often it’s three or four, but two works. All her classrooms are broken into small groups, and I guess there’s lots of educational research that backs up the fact that students learn more working in small groups than alone. I’ll ask her for some research links.

I ran across a Distributed Agile post today that dug up some more research backing up pair programming. Here’s what the post had to say:

“Pairing is the most powerful tool they’ve ever had. Skills don’t matter as much as collaboration and asking questions. Goal for new hires is to get their partner hired. Airlines pair pilots… Lorie Williams at the University of North Carolina did an experiment and found that the paired team produced 15% less lines of code with much better quality”

Native thread support in Ruby's latest version

It looks like Ruby version 1.9.1 supports native threads and fibers. Fibers are a ‘lightweight’ approach for when you don’t need full threads. It sounds like fibers are not preemptive, so they have to yield to other fibers, as opposed to threads, which can run in parallel. Fibers, on the other hand, start up faster and use less memory.

I know when Java made the leap to support native threads, it seemed like it was a huge accelerator towards greater adoption. It looks like the performance numbers are already coming in and it’s much faster.

micro-blogging = micro-coding

There’s a new service out there, Snipt, for Twitter that enables folks to post code quickly. Basically, go to Snipt, cut and paste your code into the box, and you get a small URL that people can go back to. Snipt takes your code and puts it into an image with a short URL. I did a quick search to see if anyone was using it and found a couple of tweets. Someone pointed out the incorrect usage of the image alt tag, I won’t mention on which site :). Here’s the test I threw up with some C# code. Looks like fun, especially if it helps clean up some code. I guess you can use it for more than code as well.

TweetDeck is helping to identify how services like these could be more useful; well, specifically the search feature in TweetDeck. For example, looking for some code to do x? Set up a search in TweetDeck for x, go back to it every once in a while and voila, there’s your solution. Well, that’s the theory….

Big CMS news

Interwoven has been purchased by Autonomy. Interwoven is definitely one of the most common CMS platforms we see at our enterprise 500 clients, interestingly much more so than Autonomy. Overall, this seems like a great opportunity for the two to grow market share. From a technology architecture perspective, there are a lot of great synergies that could come out of this merger. For example, publish content, update the index. Or plug Autonomy into your content repository, interrogate your data and provide insights. I wonder how MetaTagger fits in?

live streaming crashes again

So, during the historic and exciting inauguration yesterday, many of the big folks seemed to ‘melt down’, as I like to put it. Poor quality video, audio not synced, etc. I am sure there were lots of folks rebooting servers to try and keep up, but it just didn’t seem to work. While Akamai and others streamed millions of streams, up to 5.4 million simultaneous viewers per minute, there were still poor experiences. It really feels like we need to do a better job of peer-to-peer streaming as a more efficient long-term solution. Joost was heading down that route, but it sounds like they pulled their peer-to-peer approach when they went to the in-browser player. I did watch some of the inauguration on Joost and it worked great, by the way. I also read a mention of a new technology from a company called RayV which looks promising. The premise being that as more people watch, the quality increases. Even though the company is called RayV, I have no affiliation :)….

CMIS - will it revolutionize the CMS industry?

Last year three major CMS/ECM vendors, IBM, Microsoft and EMC, came together to propose new standards that will change the CMS landscape the same way SQL-92 did for the database industry. The Content Management Interoperability Services (CMIS) standards cover services that allow interoperability between content stores. These standards cover the three basic areas of Content Management Systems:

  • CMS basic operations - CRUD (Create, Retrieve, Update and Delete) services, versioning and workflow

  • Content discovery - query services and search, including a SQL-like query

  • Domain model – object types, folder hierarchy, document, relationship and access rules

Previous interoperability proposals such as JSR 170/283 have not gained traction because they were purely Java-based and too function-rich, forcing vendors to make substantial investments with little or no market-driven need. Another standard, WebDAV, was too simple and relied solely on the HTTP protocol; it had no concept of content types or content relationships.

CMIS supports both a SOAP-based interface and a REST-based interface; the latter is much easier to implement. Last month EMC, Microsoft, IBM and Alfresco were able to implement a draft of CMIS and test it on SharePoint, Documentum, FileNet and Alfresco.

The proposed CMIS query supports SQL-like terms and clauses such as SELECT, FROM, WHERE and a CONNECT BY clause. The query can include terms and clauses based on content metadata and properties such as size, date, etc. Example query:

SELECT * FROM DOCUMENT WHERE ((CONTENT_STREAM_MIMETYPE = ‘MSWORD’) AND (CONTAINS ‘Razorfish’))

This new draft CMIS standard creates a clear firewall between applications and content stores. It will cut application development and integration costs, and eliminate time spent learning vendor-specific content access APIs. Imagine being able to design an application that can access and manipulate content from any content store, and then change the underlying store by merely changing an entry in some property file. For the vendors the outlook may be murky initially; it is possible that the number of competing CMS/ECM products may shrink. Nevertheless, the market penetration of CMS products will increase dramatically and CMS/ECM may become as ubiquitous as databases. Microsoft’s involvement brings up the possibility that all MS Office products may support direct check-in/check-out from CMIS-based repositories.

The CMIS draft was submitted in September ’08 to the standards body OASIS for public comment. It is expected to be approved by the middle of 2009. The draft is also being backed by Oracle and SAP.

Resources

CMIS charter

The draft may be accessed here as a zip.

Ready for Web 3.0/Semantic Web?

When the mainstream media starts talking about the Semantic Web, one can infer that it is not just another buzzword confined to research labs. Recently The Economist and BBC Online covered this topic. Earlier this month Thomson Reuters announced a service that will help with semantic markup.

Semantic Web Primer

The term Semantic Web was first used by Sir Tim Berners-Lee, the inventor of the World Wide Web, to describe a web in which “… day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines”. The most significant aspect of the semantic web is the ability of machines to understand and derive semantic meaning from web content. The term Web 3.0 was introduced in 2006 as a next-generation web with an emphasis on semantic web technologies. Though the exact meaning and functionality of Web 3.0 is vague, most experts agree that we can expect Web 3.0 in some form starting in 2010.

There are two approaches to extracting semantic knowledge from web content. The first involves extensive natural language processing of content, while the second places the burden on content publishers to annotate or mark up their content. This marked-up content can then be processed by search engines, browsers or intelligent agents. The markup approach overcomes the shortcomings of natural language processing, which tends to be non-deterministic; furthermore, meaning depends not only on the written text, but also on information that is not captured in the text. For instance, an identical statement from Jay Leno or from Secretary Hank Paulson may have a totally different meaning.

The ultimate goal of Web 3.0, intelligent agents that can understand web content, is still a few years away. Meanwhile, we can start capturing information and building constructs into our web pages that help search engines and browsers extract context and data from content. There are multiple ways of doing semantic markup of web content that is understood by browsers and search engines.

Semantic Search Engines

On Sept 22, 2008 Yahoo announced that it will be extracting RDFa data from web pages. This is a major step in improving the quality of search results. Powerset (recently acquired by Microsoft) is initially allowing semantic searches on content from wikipedia.org, which is fairly structured content. Hakia uses a different approach: it processes unstructured web content to gather semantic knowledge. This approach is language-based and dependent on grammar.

Semantic markup - RDFa and microformats

The W3C consortium has authored specifications for annotation using RDF, an XML-based standard that formalizes relationships between entities using triples. A triple is a notation involving a subject, a predicate and an object; for example, in “Paris is the capital of France” the subject is ‘Paris’, the predicate is ‘capital of’, and ‘France’ is the object. RDFa is an extension to XHTML to support semantic markup that allows RDF triples to be extracted from web content.

Microformats are simpler markups using XHTML and HTML tags which can be easily embedded in web content. Many popular sites have already started using microformats. Flickr uses geo for tagging photo locations, and hCard and XFN for user profiles. LinkedIn uses hCard, hResume and XFN on user contacts.

Microformat hCard example in HTML, with the resulting output on the browser page: the markup describes a contact card (Atul Kedar, Avenue A | Razorfish, 1440 Broadway, New York, NY, USA), which the browser simply renders as those lines of text while parsers can extract a structured contact record.

Microformat hCalendar entry example with browser view: an event entry (“Web 3.0”, September 18th, 2008, at Sunnyvale, CA) that renders as a single line of text.

Tags: SemanticWeb

As you can see from the above examples, microformats can be added to existing content and are interpreted correctly by browsers. There are many more entities that can be semantically tagged, such as places, people and organizations. Some web browser enhancements (for Firefox, for example) recognize these microformats and allow you to add them directly to your calendar or contacts with a single click.

Automated Semantic markup services and tools

Another interesting development is in the area of automatic entity extraction from content; annotation applications and web services are being developed for this purpose. Thomson Reuters is now offering a professional service, OpenCalais, to annotate content, and Powerset is working towards similar offerings. These services reduce the need for content authors to painstakingly go through the content and manually tag all relationships. Unfortunately, these services are not perfect and need manual cross-checking and edits. Other similar annotation services or tools are Zemanta, SemanticHacker and Textwise.

Next Steps

As Web 3.0 starts to take shape, it will initially affect the front-end designers involved with the web presentation layer, as organizations demand more semantic markup within their content. In due course, CMS architects will have to update the design of data entry forms and entity information records in a manner that facilitates semantic markup and removes any duplication of entity data or entity relationships. Entity data such as author information, people information, addresses, event details, location data, and media licensing details are perfect candidates for new granular storage schemes and data entry forms.

Serious Security Flaw in Google Chrome

Security expert Aviv Raff discovered a flaw in the newly released Google Chrome browser. He set up a demo of the exploit here. This will download a java file to your desktop if you are using Chrome.

Chrome also has a potentially serious security flaw from the old version of WebKit it is based on. An attacker could easily trick users into launching an executable Java file by combining a flaw in WebKit with a known Java bug and some smart social engineering.

Meanwhile, researcher Rishi Narang disclosed another flaw that causes Chrome to crash just by visiting a malicious link, without any user interaction. He set up a proof of concept at http://evilfingers.com/advisory/google_chrome_poc.php

This is especially embarrassing for Google, as it promoted security in the new browser in its press release and even in the demo video on its website.

IE8 Beta 2 Launched

The IE team launched IE8 Beta 2 which can be downloaded at http://www.microsoft.com/ie8.

You can watch videos of IE8 at http://video.msn.com/video.aspx?mkt=en-us&user=-3161786097973413883 and http://www.microsoft.com/windows/internet-explorer/beta/videos.aspx. IE8 is a very developer-friendly browser. You can download add-ons for IE8 at http://www.ieaddons.com/. Some of my favorite add-ons include Web Slices and Accelerators.

Some cool features of IE8 Beta 2 include color-coded tabbed browsing and Accelerator support. Accelerators are services that you access directly from the webpage in the context of what you’re doing, letting you bookmark, define, email, map and more with a simple selection. Even your search providers are available as Accelerators. Some Accelerators provide previews so that you can view the result without having to leave the current webpage. Clicking on an Accelerator opens a new tab with the full result. You can download Accelerators from http://www.ieaddons.com/en/accelerators/

Also, there is better support for when a website you are viewing in a tab crashes: now, instead of closing the whole IE window along with the other tabs open in the same window, only the tab with the crashing website will close!

Microsoft adds new features to .NET

Microsoft has introduced new features in .NET with their Service Pack 1 (SP1) release of .NET Framework 3.5 and Visual Studio 2008.

Most of what is included in these service pack releases is new features and functionality rather than bug fixes and updates to the existing feature set. For example, .NET Framework 3.5 SP1 adds a new concept called the .NET Framework Client Profile, which enables an application to be delivered with just what is needed to install and run the app, rather than the whole framework. This can reduce the size of installation files by 86.5 percent, according to a Microsoft spokesperson. Other major features in .NET Framework 3.5 SP1 include a 20 to 45 percent performance improvement in Windows Presentation Foundation (WPF) applications and changes to Windows Communication Foundation (WCF) that alter the way data and services are accessed.

The changes in the Visual Studio 2008 SP1 and .NET Framework 3.5 SP1 are listed here.

Skype announces unlimited long-distance calls

Last month Skype announced unlimited calling to over a third of the world’s population with the launch of its new calling subscriptions. The new subscriptions mark the first time Skype has offered a single, monthly flat rate for international calling to landline numbers in 34 countries.

The new subscriptions have no long-term contract. You can make calls whenever you want – at any time of the day, on any day of the week. From today, you can choose from three types of subscription – from unlimited calls to landlines in the country of your choice through to landlines in 34 destination countries worldwide.

However, it’s not truly unlimited calling: all calls are subject to Skype’s fair usage policy, which is set at 10,000 minutes per month (which equates to just over 5 hours of calling per day). Calls to premium, non-geographic and other special numbers are excluded.

Developing with LINQ

Once in a while you encounter tools that change the way you work and write your code. Firebug is an obvious example of this. Having worked with .NET 3.5 for the last couple of months, I started using a great tool that made C# development much faster. The tool is LINQPad, written by Joseph Albahari. It lets you query your SQL Server database using LINQ. It is very easy to map your database and immediately start writing code.

LINQPad supports everything in C# 3.0 and Framework 3.5:

  • LINQ to SQL

  • LINQ to Objects

  • LINQ to XML

It has code snippets for pretty much all LINQ constructs.

In the results pane of the LINQPad tool, you can not only see the results of your query, but also the lambda expression it was evaluated into, as well as the resulting SQL that was executed against the database.

Not only does it speed up development, it also makes testing your queries for correctness and performance much simpler.

C# Query Expressions and 3.0 Features (Book Preview)

Bruce Eckel and coauthor Jaime King have posted a sample of their upcoming book: C# Query Expressions and 3.0 Features

From the authors:

“It's become more common for authors to offer a few pages or sometimes a chapter of their text to the public as a means of marketing. Our aim is to not only provide a sample, but also a useful stand-alone text. By itself, this sample provides any C# 2.0 programmer a foundation in C# 3.0.

This is intended to be a useful mini-book on its own, not just a teaser: it's 239 pages long and includes 82 exercises and solutions. The full book is filled with many more exercises and solutions.”

The book covers:

  • Extension methods (Inheritance vs. extension methods, Utilities for this book, Extended delegates, Other rules)

  • Implicitly-typed local variables

  • Automatic properties

  • Implicitly-typed arrays

  • Object initializers

  • Collection initializers

  • Anonymous types

  • Lambda expressions (Func)

  • Query Expressions (Basic LINQ, Translation, Degeneracy, Chained where clauses)

  • Introduction to Deferred Execution (Multiple froms, Transparent identifiers, Iteration Variable Scope)

  • More complex data (let clauses, Ordering data, Grouping data, Joining data, Nested Queries, into, let clause translations, let vs. into, joining into, Outer joins)

  • Other query operators

Download the sample here.

Apple's Leopard lasts '30 seconds' in hack contest

Apple’s Leopard has been hacked within 30 seconds using a flaw in Safari, with rival operating systems Ubuntu and Windows Vista so far remaining impenetrable in the CanSecWest PWN to Own competition.

Security firm Independent Security Evaluators (ISE) — the same company that discovered the first iPhone bug last year — has successfully compromised a fully patched Apple MacBook Air at the CanSecWest competition, winning $10,000 as a result.

Charlie Miller, a principal analyst with ISE, said that it took just 30 seconds and was achieved using a previously unknown flaw in Apple’s Web browser Safari.

Competitors in the hacking race were allowed to choose either a Sony laptop running Ubuntu 7.10, a Fujitsu laptop running Vista Ultimate SP1 or a MacBook Air running OS X 10.5.2.

“We could have chosen any of those three but had to make a judgement call on which would be the easiest and decided it would be Leopard,” Miller said.

“Every time I look for [a flaw in Leopard] I find one. I can’t say the same for Linux or Windows. I found the iPhone bug a year ago and that was a Safari bug as well. I’ve also found other bugs in QuickTime.”

RIAs and Content Management

Forrester recently published a report on Rich Internet Applications and Content Management. The report covers some of the key topics in this area, such as organic Search Engine Optimization, changes in build and release management, and how to change Content Management to better support RIAs. The content for the report came from interviews with folks managing and building sites with these technologies. Mike Scafidi’s architecture around Search Optimized Flash Architecture (SOFA) gets a mention.

Microsoft partners with social networks for contact data portability

Microsoft has partnered with some of the world’s top social networks on contact data portability. Starting today, Microsoft will be working with Facebook, Bebo, Hi5, Tagged and LinkedIn to exchange functionally-similar Contacts APIs, allowing them to create a safe, secure two-way street for users to move their relationships between their respective services. Along with these collaborations, Microsoft is introducing a new website at www.invite2messenger.net that people can visit to invite their friends from their partner social networks to join their Windows Live Messenger contact list.

The collaborations with Facebook, Bebo, Hi5, LinkedIn and Tagged will make it easier, safer, and more secure for people to have access to their contacts and relationships from more places on the web. These networks will be adopting the Windows Live Contacts API instead of “screen-scraping.”  Starting today, you can visit www.facebook.com and www.bebo.com to find your friends using the Windows Live Contacts API.  Hi5, Tagged and LinkedIn will be live in the coming months.

FCC Closes 700MHz Auction at $19.6B

Bidding in the FCC’s 700MHz auction closed March 18, 2008, after the auction raised a record $19.6 billion over 261 bidding rounds. The winners of the spectrum have not been disclosed as yet by the Federal Communications Commission. The results of this single spectrum auction surpass the $19.1 billion combined total raised by the FCC in 68 other auctions over the last 15 years. The proceeds will be transferred to the U.S. Treasury by June 30, earmarked to support public safety and digital television transition initiatives.

The spectrum auction is part of the transition to digital television that will culminate in all television signals switching from analog to digital on Feb. 17, 2009. The FCC also placed conditions on the sale of the C block spectrum, requiring the winning bidder to build an open network to which users can connect any legal device and run the software of their choice. Before the auction began in January, Google committed to meeting the minimum bid in the C block. AT&T and Verizon were also interested in the spectrum. Although the FCC did not say when the winner would be announced, the current speculation is that the FCC will release the information by the end of March or early April.

“The open platform will help foster innovation on the edge of the network, while creating more choices and greater freedom for consumers to use the wireless devices and applications of their choice,” FCC Chairman Kevin Martin said in a statement. “A network more open to devices and applications can help ensure that the fruits of innovation on the edges of the network swiftly pass into the hands of consumers.”