Abstract: An architecture of a combined virtual and materialized environment for the integration of heterogeneous data collections is presented. The collections are assumed to contain structured, semi-structured, or unstructured data. Combining virtual and materialized integration is motivated by the advantages and disadvantages of both approaches. Virtual integration is supported by subject mediation technology. Materialized integration is provided by Hadoop (an open-source software framework for the storage and distributed processing of large datasets), accompanied by a system that implements a relational warehouse over Hadoop (Hive and Big SQL are considered as examples).